1. Introduction
This technical guide explains the technology and concepts behind WANdisco's Subversion MultiSite software.
We use terms like "node" and "replication group" without defining them here. See the Glossary for an explanation of these and other WANdisco terms.
2. MultiSite Overview
Subversion is designed to run as a central server to which multiple Subversion clients connect. WANdisco's replication technology makes it possible to have multiple active replicas of a Subversion repository that are in sync. The Subversion replicas can be anywhere on a WAN - distributed throughout a company's campus or throughout the world. WANdisco users experience the performance of a local Subversion repository, with the semantics of a single shared Subversion repository. We call this "active replication with one-copy-equivalence."
Replication ensures that each replica acts as a hot backup to every other replica. If a local server experiences a problem that takes it offline, only local users are disrupted; the rest of the replication group continues as normal.
Example Replication Group
This illustration shows a replication group with five Subversion servers.
WANdisco offers a High Availability solution that ensures no disruption in service. Even if a Subversion server fails, a local backup takes over so service is uninterrupted. High Availability sub groups reside on a LAN and can be implemented either stand-alone for a local Subversion server or as part of the WAN-based MultiSite.
MultiSite with High Availability

This illustration shows a MultiSite group of six nodes at four locations, with two High Availability sub groups of two nodes. Each High Availability sub group contains a Failover Agent, a stateless member of the replication group.
Subversion MultiSite acts as a proxy between the Subversion server and its clients. An instance of the proxy runs at each replica. All the communication paths involved in the operation of WANdisco are illustrated in the diagram above.
3. WANdisco MultiSite Concepts
All MultiSite nodes are synchronized at all times: each Subversion repository is a functional replica of the others. WANdisco replication technology is the concept of one repository, multiplied. Because there are multiple synchronized repositories, each replicated node is effectively a current hot backup, which makes disaster recovery easy to plan and implement.
The Subversion usernames and passwords on all repository hosts must match. This is required because MultiSite creates a peer-to-peer replication system. Any replica of the Subversion repository is accessible by every valid Subversion user. WANdisco offers the user the option of having MultiSite manage the Subversion password file.
3.1 How Replication Works
The sites in the replication group continuously coordinate the Subversion write transactions that users make. The group establishes transaction ordering through the agreement of a quorum of replicas. When you install the first node, that node is by default the distinguished node with Singleton quorum. When you create a replication group that includes other nodes, you select the quorum type best suited to your configuration. To decide which quorum is best for a particular situation, see Quorum Recommendations.
Singleton Quorum
With Singleton Response quorum, only one of the nodes in the membership decides on the transaction order. The node that decides transaction ordering is called the distinguished node. The Singleton quorum offers the fastest response time for users working at the distinguished node, because as soon as the distinguished node determines that a transaction can be processed in the correct order, the transaction is sent to Subversion. Any replicator except the distinguished node can go down and the replication group continues. The replication group replays the missing transactions when that node rejoins the group. However, the Singleton quorum also represents a single point of failure, since replication halts if the distinguished node fails.
Majority Quorum
Majority Response is another quorum option, whereby you specify that a majority of the sites must agree on transaction order before any transaction is committed. Having a majority quorum ensures that if one site goes down in a replication group, even the distinguished node, the other sites can continue uninterrupted, as long as a majority of the sites remain available. The replication group replays the missing transactions when that site rejoins the group.
In a majority quorum, the distinguished node's role is that of a tie-breaker. For example, in a four-node replication group, three sites make the quorum (three sites must agree about transaction ordering). If two nodes want one transaction first, and the other two want another transaction first, then the distinguished node gets a weighted vote. The group with the distinguished node determines the transaction ordering. With an even number of nodes with majority quorum, you can schedule the distinguished node to rotate to different nodes around the world, to "follow the sun".
Unanimous Quorum
The last quorum option is Unanimous Response, which requires that all replicators be reachable to accomplish transaction ordering.
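The three quorum options can be compared with a small Python sketch. This is an illustrative model only, not WANdisco's implementation: the function names quorum_size and pick_first_transaction, the node names, and the transaction labels are all hypothetical. It shows how many nodes must agree under each quorum type, and how an even split under Majority quorum falls to the side that includes the distinguished node.

```python
from collections import Counter


def quorum_size(quorum_type, node_count):
    """Number of nodes that must agree on transaction order (assumed model)."""
    if quorum_type == "singleton":
        return 1  # only the distinguished node decides
    if quorum_type == "majority":
        return node_count // 2 + 1  # strictly more than half of the nodes
    if quorum_type == "unanimous":
        return node_count  # every replicator must be reachable
    raise ValueError(f"unknown quorum type: {quorum_type}")


def pick_first_transaction(votes, distinguished):
    """Pick the transaction ordered first under Majority quorum.

    votes maps a node name to the transaction that node wants first.
    """
    tally = Counter(votes.values())
    top_count = max(tally.values())
    leaders = [txn for txn, count in tally.items() if count == top_count]
    if len(leaders) == 1:
        return leaders[0]
    # Even split: the side that includes the distinguished node wins.
    choice = votes[distinguished]
    if choice not in leaders:
        raise ValueError("tie-break undefined: distinguished node is not on a tied side")
    return choice


# Four-node group, two nodes on each side, distinguished node is n3.
votes = {"n1": "txn-A", "n2": "txn-A", "n3": "txn-B", "n4": "txn-B"}
print(quorum_size("majority", 4))                         # 3
print(pick_first_transaction(votes, distinguished="n3"))  # txn-B
```

The sketch covers only the two-way tie described above; it raises an error for splits the guide does not describe.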
3.2 Replication Example
Here is an overview of what occurs when a write transaction is received by any replicator in the replication group.
- The originating client sends the transaction to Subversion MultiSite, which passes it along through the replication group.
- Transaction data is successfully received by the quorum (the distinguished node for Singleton quorum, or a majority of sites for Majority quorum). The quorum assigns the transaction a Global Sequence Number (GSN).
- After receiving the transaction, each Subversion MultiSite passes the transaction data to its local Subversion server.
- Each local Subversion server processes the transaction.
- Subversion MultiSite waits for Subversion to complete the transaction. Subversion MultiSite only marks the transaction complete when Subversion returns a completion status. If for some reason replication goes down during this process (the replicator crashes, is stopped by an admin or the server it's running on shuts down), Subversion MultiSite does not mark the transaction as complete, and it gets reprocessed upon replication restart.
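The steps above can be sketched in Python as a toy model. Everything here is an assumption made for illustration: the Replica class, the replicate function, and the crash_before parameter are hypothetical names, and a plain list stands in for the local Subversion server. The point of the sketch is the ordering by GSN and the fact that a transaction that was never marked complete is processed again after a restart.

```python
import itertools


class Replica:
    """Toy stand-in for one Subversion MultiSite node (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.pending = {}     # GSN -> transaction data not yet marked complete
        self.completed = []   # GSNs marked complete, in processing order
        self.repository = []  # stands in for the local Subversion server

    def receive(self, gsn, data):
        self.pending[gsn] = data

    def process_pending(self, crash_before=None):
        """Apply pending transactions in GSN order.

        crash_before simulates replication going down before that GSN is
        applied; the transaction stays pending and is not marked complete.
        """
        for gsn in sorted(self.pending):
            if gsn == crash_before:
                return False
            self.repository.append(self.pending.pop(gsn))
            self.completed.append(gsn)  # marked complete only after applying
        return True


def replicate(transactions, replicas, next_gsn):
    """Assign each transaction a GSN and pass it to every replica."""
    for data in transactions:
        gsn = next(next_gsn)
        for replica in replicas:
            replica.receive(gsn, data)


gsn_counter = itertools.count(1)
sites = [Replica("site1"), Replica("site2")]
replicate(["commit r101", "commit r102"], sites, gsn_counter)

sites[0].process_pending()
sites[1].process_pending(crash_before=2)  # goes down before GSN 2 is applied
sites[1].process_pending()                # restart: GSN 2 is reprocessed
print(sites[0].repository == sites[1].repository)  # True
```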
3.3 WANdisco is Listening
A field in the Admin Console tells you whether WANdisco is accepting incoming Subversion client requests. Replication continues among the WANdisco sites whether or not WANdisco is listening at any given site.
You can turn the listening on and off through the Admin Console (through the Start Proxy and Stop Proxy commands). Issuing the Stop Proxy command on a site puts that Subversion server in read-only mode.
The following illustration shows Sites 2 and 5 are not listening. (An administrator executed the Stop Proxy command for those sites.) Replication continues, and Sites 2 and 5 are still receiving and processing replicated transactions originating from the other sites. However, Subversion users at Sites 2 and 5 cannot make any write transactions. Once an administrator issues the Start Proxy command for Sites 2 and 5, the local Subversion users can again issue Subversion commands.
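The behavior in this scenario can be modeled with a short Python sketch. It is not the Admin Console API: the ProxySite class and its method names are hypothetical, chosen to mirror the Start Proxy and Stop Proxy commands. It assumes that a site that is not listening rejects local write requests but still applies transactions replicated from other sites.

```python
class ProxySite:
    """Toy model of one site's proxy (hypothetical, for illustration)."""

    def __init__(self, name):
        self.name = name
        self.listening = True
        self.applied = []  # transactions applied to the local repository

    def stop_proxy(self):
        self.listening = False

    def start_proxy(self):
        self.listening = True

    def local_write(self, txn):
        """A write from a local Subversion client; rejected when not listening."""
        if not self.listening:
            return False
        self.applied.append(txn)
        return True

    def replicated_write(self, txn):
        """A transaction from another site; applied regardless of listening."""
        self.applied.append(txn)


site2 = ProxySite("Site 2")
site2.stop_proxy()
print(site2.local_write("local commit"))  # False: local users cannot write
site2.replicated_write("commit from Site 1")
site2.start_proxy()
print(site2.local_write("local commit"))  # True: local users can write again
print(site2.applied)  # ['commit from Site 1', 'local commit']
```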
For High Availability sub groups, shutting down the Failover Agent stops WANdisco from accepting local client requests.
Please follow your company guidelines for notifying Subversion users of maintenance.
3.4 Synchronized Stop of All Sites
When an administrator issues a synchronized stop command, the Subversion servers stop accepting write commands from clients. Pending transactions are processed, but no new write transactions are accepted. Subversion users continue to have read access to the repository, but cannot perform write operations, such as commit or lock.
When an administrator issues a resume command, the WANdisco proxies restart and begin accepting write transactions.
4. Handling Node and Network Failure
4.1 Node Failure
For Singleton Response quorum: say you have a five-node replication group, spread across three continents. One of the nodes goes down. Replication continues at the remaining nodes, as long as the quorum is reached, although users connecting to the downed node are read-only until that node can reach the quorum.
As soon as a node comes back up, the replication group catches up the node on its missing transactions, so that all nodes are again synchronized.
For Majority Response quorum: say you have a five-node replication group. One or two nodes could go down, and replication would continue at the other nodes, as long as a majority of nodes remain up. The one or two downed nodes go into read-only mode. As soon as a node comes back up, the replication group catches up the node on its missing transactions, so that all nodes are again synchronized.
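The five-node examples above reduce to a simple check, sketched here in Python. The function replication_continues and the node names are hypothetical, and the rules are a simplification of the descriptions in this guide: Singleton quorum needs the distinguished node to be up, Majority quorum needs more than half of the nodes to be up, and Unanimous quorum needs every node.

```python
def replication_continues(quorum_type, nodes, down, distinguished):
    """Return True if the nodes still up can reach quorum (assumed model)."""
    all_nodes = set(nodes)
    up = all_nodes - set(down)
    if quorum_type == "singleton":
        return distinguished in up
    if quorum_type == "majority":
        return len(up) >= len(all_nodes) // 2 + 1
    if quorum_type == "unanimous":
        return up == all_nodes
    raise ValueError(f"unknown quorum type: {quorum_type}")


nodes = ["n1", "n2", "n3", "n4", "n5"]

# Majority quorum: two of five nodes down still leaves a majority of three.
print(replication_continues("majority", nodes, {"n4", "n5"}, "n1"))        # True
print(replication_continues("majority", nodes, {"n3", "n4", "n5"}, "n1"))  # False

# Singleton quorum: any node except the distinguished node can go down.
print(replication_continues("singleton", nodes, {"n2"}, "n1"))  # True
print(replication_continues("singleton", nodes, {"n1"}, "n1"))  # False
```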
4.2 Network Failure
If a network link goes down for one node, and outside connectivity is completely lost, there are two possible scenarios, depending on your quorum:
- If you have Singleton quorum, and the distinguished node's network link goes down, the distinguished node alone can make progress. The Subversion users local to the distinguished node continue working uninterrupted, while users at other sites in the replication group can perform only read operations (such as up, co, and log), working with stale data.
- If you have Majority quorum, and one site's network link is lost, then users at that site can perform only read operations (such as up, co, and log), working with stale data. Provided that the remaining sites can still meet quorum (a majority of sites responding), the other sites continue working uninterrupted.
When connectivity is restored or the failed node is back online, the local node syncs up with the replication group automatically. First, the local node consults its local recovery journal (similar to a database redo log), and then, if necessary, attempts recovery from any of the quorum sites.
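The recovery order described above (local journal first, then the quorum sites) can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual recovery mechanism: the recover function, the use of dictionaries keyed by GSN, and the site names are all hypothetical.

```python
def recover(last_applied_gsn, latest_gsn, local_journal, quorum_sites):
    """Plan the replay of missing transactions in GSN order.

    local_journal maps GSN -> transaction data held locally.
    quorum_sites maps site name -> that site's GSN -> transaction data.
    Returns a list of (gsn, data, source) tuples.
    """
    replay = []
    for gsn in range(last_applied_gsn + 1, latest_gsn + 1):
        # Step 1: consult the local recovery journal.
        if gsn in local_journal:
            replay.append((gsn, local_journal[gsn], "journal"))
            continue
        # Step 2: if necessary, fetch from any quorum site that has it.
        for site_name, site_log in quorum_sites.items():
            if gsn in site_log:
                replay.append((gsn, site_log[gsn], site_name))
                break
        else:
            raise LookupError(f"GSN {gsn} not found in journal or quorum sites")
    return replay


journal = {4: "txn-4"}
sites = {"site2": {5: "txn-5"}, "site3": {5: "txn-5", 6: "txn-6"}}
for entry in recover(3, 6, journal, sites):
    print(entry)
```

In this example the node has applied everything up to GSN 3, finds GSN 4 in its own journal, and obtains GSNs 5 and 6 from the other sites.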
The recovery infrastructure and details of WANdisco fault-tolerance can be found at http://www.wandisco.com/pdf/dcone-whitepaper.pdf.
Copyright © 2010 WANdisco
All Rights Reserved
This product is protected by copyright and distributed under
licenses restricting copying, distribution and decompilation.