08-22-2016 10:16 AM
As in the title, we are planning a DR exercise in which we will do the following, one at a time:
- Switch the GridMaster and GridMaster Candidate (promote the GMC for 15 minutes and then switch back).
- We have two members performing External DNS Forwarding. We plan to stop DNS services on one of them for 15 minutes.
- We have two members handling Production DNS Resolution. We will be stopping DNS Services on one of them for 15 minutes.
- We have two members performing DHCP in a failover association. We plan on stopping DHCP services on one of them for 15 minutes.
All of these members are HA Pairs running NIOS 6.12.8.
I have been asked to reach out to the community to see whether anyone has experience with, or recommendations from, a similar exercise that we should be aware of before performing this one.
Thank You in Advance,
08-23-2016 11:24 AM
I will let others provide feedback on their actual experiences and testing methodologies, but I will make one comment.
Promoting a GMC is a somewhat disruptive event, as each member needs to re-establish its tunnel, perform a full Grid sync from the new master, and restart any appropriate services. This can be performed with a delay between members (you will be prompted to add a delay from the CLI when you run set promote_master). The delay ensures that the members do not all sync and restart at the same time. Without the delay, the promotion could disrupt overall services, which would likely not be desired for a DR test.
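To get a feel for how a per-member delay stretches the promotion, here is a rough planning sketch in Python. It is not an Infoblox tool and does not talk to the Grid. It assumes, for illustration only, that the delay is applied sequentially from one member to the next, and the member names are hypothetical placeholders.

```python
from datetime import datetime, timedelta


def staggered_schedule(members, start, delay_minutes):
    """Return (member, expected_start) pairs, spaced one delay apart.

    Assumes each member begins its re-sync one delay after the previous
    one. This is a planning approximation, not actual Grid behavior.
    """
    delay = timedelta(minutes=delay_minutes)
    return [(name, start + i * delay) for i, name in enumerate(members)]


def total_stagger_minutes(member_count, delay_minutes):
    """Minutes between the first and the last member starting."""
    # The last member starts (member_count - 1) delays after the first.
    return max(member_count - 1, 0) * delay_minutes


if __name__ == "__main__":
    # Hypothetical member names; substitute your own Grid members.
    members = ["dns-fwd-1", "dns-fwd-2", "dns-prod-1",
               "dns-prod-2", "dhcp-1", "dhcp-2"]
    start = datetime(2016, 8, 29, 10, 0)
    for name, when in staggered_schedule(members, start, delay_minutes=2):
        print(f"{when:%H:%M}  {name}")
    print("stagger (min):", total_stagger_minutes(len(members), 2))
```

With six members and a two-minute delay, the last member would start ten minutes after the first under this assumption. That is worth comparing against a 15-minute test window, since the same stagger applies again when switching back.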
I hope that helps!
08-29-2016 11:46 AM
Thank you. We were already planning on putting two minutes between members. Since the goal is to avoid problems others have run into, we would like to keep this thread open so that others can share their experiences as well.