Cluster continuous replication (CCR) combines automatic management of redundancy with application-level data replication. CCR can be deployed with no single point of failure within a single data center or between two data centers. CCR uses three computers (referred to as nodes) joined in a single cluster. Two of the nodes host a clustered mailbox server (CMS). CCR uses the third node (referred to as the voter) to avoid a network partition within the cluster, also known as split brain syndrome. Split brain syndrome occurs when all networks designated to carry internal cluster communications fail, and nodes cannot receive heartbeat signals from each other. Split brain syndrome is prevented by always requiring a majority of the three nodes to be communicating for the clustered mailbox server to be operational. When a majority of the nodes are communicating, the cluster is said to have a quorum. A node that is currently running a clustered mailbox server is an active node, and a node that is not running a clustered mailbox server is a passive node. A Majority Node Set (MNS) cluster also offers a file share witness for the cluster quorum, used in place of a third voter node; this is the preferred method for CCR.
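The majority rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how the cluster service is implemented; the function names and the node names NodeA, NodeB and Voter are made up for the example.

```python
def has_quorum(reachable_votes: int, total_votes: int = 3) -> bool:
    """Return True when a strict majority of the votes can communicate."""
    return reachable_votes > total_votes // 2


def partition_with_quorum(partitions, total_votes=3):
    """Return the one partition that holds a majority, or None if none does.

    Each partition is the set of voters that can still hear each other's
    heartbeats. At most one partition can ever hold a strict majority,
    which is what rules out split brain.
    """
    for members in partitions:
        if has_quorum(len(members), total_votes):
            return members
    return None


# Hypothetical failure: NodeA loses all cluster networks and is isolated,
# while NodeB can still reach the voter.
partitions = [{"NodeA"}, {"NodeB", "Voter"}]
winner = partition_with_quorum(partitions)
print("Partition allowed to run the CMS:", winner)
```

With three votes, two communicating members are enough for quorum, so the isolated node stops hosting the clustered mailbox server and the other side carries on. If all three are cut off from each other, nobody has quorum and the clustered mailbox server stays offline.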
There are specific steps that you perform to install the Mailbox Server role on the active and passive nodes, but there are no Microsoft Exchange installation or configuration tasks for the voter node. The installation of the voter is completed when the cluster node that does not act as an active or passive node is incorporated into the cluster. No further configuration or software installation is required for the voter node.
The file share for the witness must be configured on a Windows Server computer in the domain other than the cluster nodes. For Exchange 2007, the file share witness can be hosted on a Hub Transport server. This approach reduces hardware and support costs and is recommended for the MNS cluster.
A stand-alone (non-clustered) mailbox server adopts the network identity of its host computer. In a CCR environment, a clustered mailbox server's network identity is designed to move between the nodes in a process known as failover. A clustered mailbox server's network identity is its network name and IP address, which are separate from the node's own IP address and host name. If the node running a clustered mailbox server experiences problems, the clustered mailbox server goes offline for a brief period until another node takes control of the clustered mailbox server's IP address and network name and then brings the clustered resources online.
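To make the separation between node identity and clustered mailbox server identity concrete, here is a small toy model in Python. It is a sketch under my own assumptions: the class names, host names and addresses (taken from the documentation range 192.0.2.0/24) are hypothetical, and the real cluster service does far more than this.

```python
from dataclasses import dataclass


@dataclass
class Node:
    hostname: str          # the node's own host name
    ip: str                # the node's own IP address
    healthy: bool = True


@dataclass
class ClusteredMailboxServer:
    network_name: str      # identity that clients connect to
    ip: str                # identity IP, separate from any node IP
    owner: Node            # node currently hosting the identity
    online: bool = True


def fail_over(cms: ClusteredMailboxServer, nodes) -> bool:
    """Move the CMS identity to another healthy node.

    The CMS is offline while no node owns it, and comes back online
    under the same network name and IP once a new owner takes over.
    """
    cms.online = False
    for node in nodes:
        if node.healthy and node is not cms.owner:
            cms.owner = node
            cms.online = True
            return True
    return False


node_a = Node("NODEA", "192.0.2.11")
node_b = Node("NODEB", "192.0.2.12")
cms = ClusteredMailboxServer("MAILCMS", "192.0.2.50", owner=node_a)

node_a.healthy = False                 # simulated problem on the active node
moved = fail_over(cms, [node_a, node_b])
print(cms.network_name, cms.ip, "now hosted on", cms.owner.hostname)
```

Clients keep using the same network name and IP address before and after the move; only the owning node changes.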
How CCR Replication Works
Transaction log replication and replay are used to copy the databases and keep the data on the two nodes in sync. Replication takes advantage of the change history produced by the Extensible Storage Engine (ESE). This change history is represented as a sequence of fixed-size (1 MB) log files. The replication service copies each log file to the passive node as it is generated. The replication mechanism is asynchronous to the online database. When the logs arrive at the passive node, they are inspected and replayed into the copy of the database that is stored on the passive node. The replay process applies the changes described in each log to the passive node's database, so that it matches the production database with a slight time lag.
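The copy, inspect and replay cycle can be pictured with a toy model in Python. This is a sketch only: the LogFile structure, the CRC check and the dictionary standing in for the database are my own simplifications and do not reflect the actual ESE log format or the replication service's internals.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class LogFile:
    generation: int     # position of this log in the sequence
    changes: tuple      # (key, value) pairs recorded on the active node
    checksum: int       # CRC over the payload, checked by the inspector


def _crc(generation, changes):
    return zlib.crc32(repr((generation, changes)).encode())


def close_log(generation, changes):
    """Seal a log on the active node so that it can be shipped."""
    changes = tuple(changes)
    return LogFile(generation, changes, _crc(generation, changes))


class PassiveCopy:
    """Passive node: inspects each shipped log, then replays it."""

    def __init__(self):
        self.database = {}
        self.last_replayed = 0

    def inspect(self, log):
        # Accept only the next log in sequence with an intact payload.
        in_sequence = log.generation == self.last_replayed + 1
        intact = log.checksum == _crc(log.generation, log.changes)
        return in_sequence and intact

    def receive(self, log):
        if not self.inspect(log):
            return False
        for key, value in log.changes:   # replay into the database copy
            self.database[key] = value
        self.last_replayed = log.generation
        return True


# The active node applies changes, closes a log, and ships it afterwards,
# so the passive copy always trails the active database slightly.
active_db = {}
passive = PassiveCopy()
batches = [[("mail1", "hello")],
           [("mail2", "world"), ("mail1", "hello again")]]
for generation, changes in enumerate(batches, start=1):
    for key, value in changes:
        active_db[key] = value
    passive.receive(close_log(generation, changes))

print("Copies match:", passive.database == active_db)
```

The sequence check in the inspector is what keeps a missing or out-of-order log from being replayed, and the checksum stands in for the integrity checks done before a log is trusted.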
Since the data is replicated between the nodes, the clustered mailbox server can operate on either of the two nodes. This capability provides increased availability because scheduled outages and failures of one node do not cause an extended outage of the clustered mailbox server. Assuming that the voter is still available and that it can communicate with an available passive node, the clustered mailbox server moves to the remaining node and continues to operate.
The replication service is responsible for creating a replica instance and its associated objects, such as the log copier, log inspector and log replayer, as discussed in my earlier post.
In CCR replication, all the active components of replication are on the passive node. The log files are copied via a network share using the Windows networking protocol SMB. In the CCR Architecture diagram above, the arrow between the log directories represents the copying of the logs from one machine to the other. The replication service runs on both nodes to facilitate the copy, inspect and replay of the log files. In an Exchange 2007 MNS cluster, CCR is always on by default. Any time you create a new storage group or database, a cluster resource is created for the database and replication is set up automatically between the nodes.