What to Do If You Lose Quorum (Server Clusters: Majority Node Set Quorum)
Forcing quorum is a manual process that requires the following steps:
Stop the cluster service ON ALL of the remaining nodes using cluster administrator.
The cluster service must be told which nodes should be considered as having quorum. This can be done in one of two ways:
Set up the ForceQuorum registry key ON ALL remaining nodes in the cluster under
This is a REG_SZ key that should be set to a comma-separated list of the names of the nodes that are to have quorum. The key is case insensitive. So, in the above example, if the secondary site contains "Node5", "Node6" and "Node7", then the ForceQuorum registry key should be set up as
|There should be no spaces in the key (except where there are spaces in the node names themselves).|
Once the registry keys are set on all nodes, the cluster service can be started on those nodes.
Set up the cluster service startup parameters ON ALL remaining nodes in the cluster. This is done by opening the services control panel, selecting the cluster service and entering the following into the "start parameters" option:
/forcequorum <node list>
In the above example, if the secondary site contains "Node5", "Node6" and "Node7", then the cluster service start parameter should be set to:
The cluster service MUST be started by clicking the START button on the service control panel. Do not click OK or Apply first, as this does not preserve the parameters.
|Any command line parameters override the registry setting. However, the command line parameters do NOT persist across a reboot, and therefore setting the registry key is the preferred mechanism for forcing quorum.|
The cluster service will now start up on those nodes that are considered part of the quorum set and resources will be brought online.
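The formatting rules for the node list described above (comma-separated, no padding spaces, case-insensitive names) can be sketched in a few lines of Python. This is only an illustration and not part of the original procedure: the helper names `build_force_quorum_value` and `build_start_parameter` are hypothetical, and the script only builds the strings. It does not touch the registry or the cluster service.

```python
def build_force_quorum_value(node_names):
    """Join node names into a comma-separated list with no padding spaces."""
    cleaned = [name.strip() for name in node_names]
    if not cleaned or any(not name for name in cleaned):
        raise ValueError("every node name must be non-empty")
    if any("," in name for name in cleaned):
        raise ValueError("node names must not contain commas")
    # Names are case insensitive, so reject duplicates that differ only by case.
    if len({name.lower() for name in cleaned}) != len(cleaned):
        raise ValueError("duplicate node name")
    return ",".join(cleaned)


def build_start_parameter(node_names):
    """Format the service start parameter as '/forcequorum <node list>'."""
    return "/forcequorum " + build_force_quorum_value(node_names)


if __name__ == "__main__":
    nodes = ["Node5", "Node6", "Node7"]
    print(build_force_quorum_value(nodes))
    print(build_start_parameter(nodes))
```

Spaces inside a node name are kept as they are; only leading and trailing whitespace around each name is stripped, which matches the note that the list itself should contain no extra spaces.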
Special care must be taken if and when the primary site comes back since the nodes are configured as part of the cluster.
Do NOT reboot the cluster nodes at the primary site
Stop the cluster service ON ALL of the cluster nodes
Remove the registry key setting or the cluster service startup parameters set to force quorum
Start up the cluster service on all of the nodes at the secondary site
Boot the nodes at the primary site
|The cluster service on all nodes NOT in the force quorum node list must remain stopped until the force quorum information is removed. Failure to do so can lead to data inconsistencies OR data corruption.|
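As a rough aid for the warning above, the following Python sketch works out which nodes must keep the cluster service stopped, given the full node list and the force quorum list. The function name is hypothetical, and the primary-site names "Node1" to "Node4" in the usage example are assumptions made for illustration; only "Node5" to "Node7" come from the example in the text.

```python
def nodes_that_must_stay_stopped(all_nodes, force_quorum_value):
    """Return the nodes that are NOT named in the force quorum list."""
    # Node names are compared case-insensitively, as the key is case insensitive.
    forced = {
        name.strip().lower()
        for name in force_quorum_value.split(",")
        if name.strip()
    }
    return [node for node in all_nodes if node.lower() not in forced]


if __name__ == "__main__":
    # Node1..Node4 are assumed primary-site names, used only for illustration.
    cluster = ["Node1", "Node2", "Node3", "Node4", "Node5", "Node6", "Node7"]
    print(nodes_that_must_stay_stopped(cluster, "node5,node6,node7"))
```

Any node returned by this check should have its cluster service left stopped until the force quorum setting has been removed.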
While a cluster is running in the force quorum state, it is fully functional. For example, nodes can be added or removed from the cluster; new resources, groups etc. can be defined.
The only problem: I could not get the above commands to work on a 64-bit Windows Server 2003 R2, Enterprise Edition SP2 machine. I mostly got invalid syntax errors.
Here is what to do:
1. We shut down one of the nodes, a true power off. We will call this the passive node.
2. We added the following value to this registry key on the surviving node (active node):
HKLM\System\CurrentControlSet\Services\Clussvc\Parameters
3. Replace nodenamea with the machine's name, such as exch2007nodea, where this is the node that is currently running.
4. We attempted to start the cluster service on the active (surviving) node and it started.
5. We then stopped the cluster service on the active (surviving) node and added nodenameb to the ForceQuorum data value on the surviving node.
6. We restarted the powered off (passive) machine.
7. We then started the cluster service on the active node and it started, with the ForceQuorum registry value containing both node names.
8. We attempted to start the cluster service on the passive node (with no parameters or registry changes) and it started.
9. We verified that the Cluster group resources were online.
10. Undo the registry changes by deleting the ForceQuorum key from the Active node.
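The walkthrough above can be written down as an ordered command plan. The Python sketch below only generates the command strings; it does not run them. It assumes the registry edits are made with the standard `reg add` / `reg delete` commands and the service is controlled with `net start clussvc` / `net stop clussvc`, which is one possible way to carry out the steps and not necessarily how they were done here. The function names and the passive node name `exch2007nodeb` are hypothetical.

```python
# Registry key from step 2; ForceQuorum is stored there as a REG_SZ value.
KEY = r"HKLM\System\CurrentControlSet\Services\Clussvc\Parameters"


def set_force_quorum_cmd(nodes):
    """Build a 'reg add' command that writes the comma-separated node list."""
    value = ",".join(nodes)
    return f'reg add "{KEY}" /v ForceQuorum /t REG_SZ /d "{value}" /f'


def delete_force_quorum_cmd():
    """Build a 'reg delete' command that removes the ForceQuorum value."""
    return f'reg delete "{KEY}" /v ForceQuorum /f'


def walkthrough_commands(active, passive):
    """Return (node, command) pairs for the steps that involve a command.

    Steps 1, 6 and 9 (power off, power on, verify resources) stay manual.
    """
    return [
        (active, set_force_quorum_cmd([active])),            # step 2
        (active, "net start clussvc"),                       # step 4
        (active, "net stop clussvc"),                        # step 5
        (active, set_force_quorum_cmd([active, passive])),   # step 5
        (active, "net start clussvc"),                       # step 7
        (passive, "net start clussvc"),                      # step 8
        (active, delete_force_quorum_cmd()),                 # step 10
    ]


if __name__ == "__main__":
    # exch2007nodeb is an assumed name for the passive node.
    for node, command in walkthrough_commands("exch2007nodea", "exch2007nodeb"):
        print(f"[{node}] {command}")
```

Printing the plan first and running each command by hand keeps the manual checkpoints (powering the passive node off and on, verifying the Cluster group) in the operator's control.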
Exchange Server 2007 System Attendant fails to come online within a CCR/SCC cluster
After the cluster was up and running, the Exchange SA was not. Looking in the Application event log, we saw the following errors related to the Exchange SA failing to start:
Event ID 1011, 1030, 1003, and 1019 errors.
We found that a bug exists where the Exchange SA times out after 40 seconds when the default of 180 seconds is used for the resource.
We changed the value to 179 and the Exchange SA resource came online. This is scheduled to be fixed in SP1. This bug was confirmed for SCC & CCR Exchange Server 2007 Clusters.