WANdisco
 Navigation:  v | Release Notes | Install | Upgrade | Administration | Reference | Gerrit | API | Glossary | Archive

Reference Guide

1. UI tabs

1.1 DASHBOARD

The dashboard provides administrators with a service status for Git MulitSite and pulls in any urgent Log Messages. the dashboard

System status and log messages

1.2 Repositories

The tab under which you manage your replicated Git Repositories.

Add - Edit - Repair - Sync Stop - Git GC - Remove - Reload

repositoriestab

Repositories tab

1.2.1 Repository table

All repositories that you add to Git MultiSite will appear on the repository table.

Name
The Repository Name
Known issue: duplicate repository names allowed
It's currently possible to add multiple repositories with the same name (they'll need different paths though). Ensure that you don't use the same name for multiple repositories, this is for obvious reasons a bad practice and will be prevented in future releases.
Path
The local path to the repository. This needs to be the same across all sites
Replicator Group
The replication group in which the repository is replicated
Size
The file size of the repository
Youngest Rev
This is the latest revision number for the repository
Transactions
Lists any pending transactions. The transactions link to the last transactions played out for the repository:
edit repository 2

Click on a transaction box.

edit repository 2

Transactions list revealed!

Last Modified
The date and time of the last modification to the repository
Global RO
Indicates if the repository is globally read-only
Stops any further commits from Git users
The term Global Read-only doesn't accurately reflect what happens at the repository-level. When a repository enters a Global Read-only state it will no longer accept any commits from Git clients. However, proposals that are flying around within the state machine can still be written. It is this state that allows nodes to reach a synchronized stop.
Local RO
Indicates if the repository is locally read-only. A locally read-only repository is completely locked down, it does not accept new commits from Git clients but does accept futher changes from within the replication system.
Status
Indicates whether the repository is replicating or has stopped. A stopped repository will be in a read-only state, either globally or locally
Under control
Remember that this table doesn't automatically show all the repositories on the server, only those repositories that have been added. See Add Repository.

Add Repository

Click ADD in order to add a new repository to Git MulitiSite - the repository must already exist and be tested before you place it under the control of MultiSite.

edit repository 2
Repo name
Choose a descriptive name. This doesn't need to be the folder name, it can be anything you like.
FS Path
The local file system path to the repository. This needs to be the same across all sites.
Replicator Group
The replication group in which the repository is replicated. It is the replication group that determines which sites hold repository replica, and what role each replica plays.
Global Read-only
Check box that lets you add a repository that will be globally read-only. In this state Git MultiSite continues to communicate system changes, such as repository roles and scheduling, however, no repository changes will be accepted, either locally or through proposals that might come in from other sites - which in most cases shouldn't happen as by definition the repository should also be read-only at all other sites.
Global Read-only
You can think of the Global Read-only flag as quick means of locking down a repository, so that no commits will be accepted at any site.
Add Repo
Click the Add Repo button when you have entered all the required fields for the repository that you are adding. You can cancel the addition of the repository by clicking on the circular cross icon that appears on the left-hand side of the entry fields.
Filter Repositories
You can use this search box to filter the list of available repositories, useful if you're running with a large number of repositories.

Edit Repositories

It's possible to make changes to a repositories settings after it has been added. This is how:

Repair

tip

The Repository Repair tool is used when a repository on one of your nodes has been corrupted or similarly requires repair or replacement. Selecting a repository to repair, the tool asks to you select a 'helper' node. This node will briefly stop replicating as it will be used to copy or rsync an up-to-date replica of the broken repository onto the current node.

Sync Stop

The Sync Stop tool lets you bring replication to a stop for a selected repository. The tool is required to ensure that when replication has stopped all repository replica remain in exactly the same state. This requirement is complicated within distributed systems where proposals may be accepted on some nodes while still in-flight to other nodes. See Performaing a synchronized stop

Git Garbage Collection

The Git Garbage Collection tool triggers a clearout of temporary files associated with proposals that have been successfully played out on each node in the repository's replication group. Read more about Garbage Collection in our Knowledge Base Git Garbage Collections

Remove

Use the remove button after selecting a repository to remove the selected repository from MultiSite's control. The repository data will not be deleted but once deleted, changes made to the repository locally will no longer be replicated to other nodes. See 8. Removing Repositories

Reload

Click the Reload button to force the Repository screen to refresh so that any recent changes are displayed.

1.3 Replication Groups

Replication Groups are units of organization that we use to manage replication of certain repositories between certain sites. In order to replicate a Git repository between two or more nodes, you would need to associate by adding them to a replication group:

1.3.1 Example replication groups

An organization with developers working in Chengdu and San Fransisco need to collaborate on projects stored in two Git repositories, Repo1 and Repo3. An administrator in the Chengdu office creates a replication group called "DragonGroup". The Git MultiSite nodes corresponding with each of the two sites are added to the group.

edit repository 2

Replication Group Example 1

The Chengdu office is the location of largest development team, where most repository changes occur. For this reason the node is assigned the role of Tie-breaker. In the event that there is disagreement between the nodes in the group over transaction ordering, NodeChengdu will carry the deciding vote.

The node in San Fransisco hosts a standard active node. Changes to the local repository are replicated to NodeChengdu, changes made on the Chengdu node are replicated back to San Fransisco.


In addition to the two active nodes, a third node, NodeParis is added, located at a management site that plays no active part in development. However, the node has been added to the group as a "pure voter". This means that NodeParis takes part in the vote for transaction ordering, even though the payload of those transactions are not written to repository replicas stored at the Paris office. The purpose of NodeParis is simply to add resilience to the replication system, in the event of a short-term disruption to traffic from one of the other two nodes, agreement can still be reached and replication could continue.

The organization might choose to make the Paris node "Passive" instead. With ParisNode running a passive node, replicas of Repo1 and Repo3 would also be stored in Paris, although these copies could only be changed by updates from the other (active) repository copies, they wouldn't be accessible to Git clients.

Types of node

There's another element controlled by replication groups, the role that each repository replica plays the replication system, where each replica may be casting a vote in order to determine the correct order in which proposals are played out. See more about Types of Sites.

1.3.2 Create replication group

You can create a replication group providing that you have at least two sites connected.

  1. Click Create Replication Group.
    edit repository 2
  2. Replication Groups


  3. Enter a Group Name, it can be anything but it's a good idea to keep it simple.
    group name
  4. Group names


  5. Select from the available sites. As you select sites they'll be listed in the add node's field.
    group name

    Add some sites - you need a minimum of two

  6. The local node is automatically made the first member
    It's not possible to create a replication group remotely, the initiating node is alsways added automatically.
  7. Added sites are by default given the role Active Voter. Click on a node's label to bring up a list of available roles. See Types of Sites.
    site type
  8. Replication groups

  9. When you have added all the nodes that you need, click on the Create Replication Group. Replication groups are listed in boxes on the Replication Group tab.

View

You can view and partly edit a replication group by clicking on the view button.

  1. Click on the Quick View button.
    site type
  2. The replication group's screen will open showing the member nodes of the group.
    site type
  3. Each node is desplayed as a color-coded circle. Click on the circle to see what other node types are available. Read more about node types.
    site type
  4. The Configuration screen provides access to the each node's type, along with a list of repositories and a link to the Configure Schedule screen.

Add Nodes

It's possible to add additonal nodes to a replication group. Click on the Add Nodes button to start the procedure, you can read about it in the Admin Guide - 13. Adding a node to a replication group

Save Node Roles

Use this button to save any changes that you make to the member nodes.

Configure Schedule

The Schedule screen lets you set the roles of sites to change over time, specifically changing according to a schedule.

Why change a node's role?

At the heart of WANdisco's DConE2 replication technology is an agreement engine that ensures that Git operations are performed in exactly the same order on each replica, on each node. Any node that has the role of voter becomes part of the agreement engine and together with other voters determine the correct ordering. If there's high latency between any voters this may adversely affect replication performance. Fortunately it isn't a requirement that every site takes part in forming agreement. An Active site can still create proposals (i.e. instigate repository changes) but the agreement engine doesn't need to wait for its vote. Read more about how replication works in the Reference Section.

Follow the Sun

So in order to optimize replication performance it's common for administrators to remove voter status from node after their staff leave for the day - a practice commonly known as "Follow The Sun" where far-flung organisations transfer roles and privileges between locations so that they are always held by an active, manned site.

site type

Role Schedule
The Role Schedule window shows all the sites in the replication group, along with each node's current roll (denoted by a circular buttons). It's possible to click on a node and change its role as a Schedule Manager.
site type
If you change any sites you need to click the SAVE SITE ROLES button. to ensure that your changes are saved.
There's an indicator as to whether a Role Schedule has been enabled on the group. If there's a schedule in place the indicator will show ENABLED, otherwise it will show DISABLED.
site type
Clicking on the indicator will open up the Schedule screen, the same as if you clicked on the CONFIGURE SCHEDULE button.
Schedule
A drop-down selector for viewing a particular day of the week. This filter is only for viewing an established schedule, it isn't used to select the day that you intend to schedule.
Clear Schedule
Use this button to blank out settings you have changed, returning to the default schedule.
Save Schedule
After making any changes to the schedule, you need to click the SAVE SCHEDULE to apply them. Next to the button is a reminder of the server's timezone, this is derived from the server's system clock, not the location information provided during installation.

To see how to setup a schedule, see the Git MultiSite Admin Guide - How to configure a schedule

1.4 Settings

The server's internal settings are reported on the Settings tab, along with a number of important editable settings.

settings

Administrator Settings

Change Administrator Settings
Use this form to change the current administrator password, this is the password that you provided during installation that is used to log in to the Git MultiSite Plus browser-based User Interface
See How to change your Administrator Password.
User Interface HTTP Port
Change the port that you want to use to access the User Interface. Just enter a valid port number and clock SAVE.

Resource Monitoring Data

settings

The Resource Monitoring Data settings provide a basic tool for monitoring available disk storage for MultiSite's resources.

Monitor Interval (mins)
If the disk space available to a monitored resource is less than the value you have for a "Severe" event then the event is logged and MultiSite's replicator will shut down after this interval, currently set at 10 minutes by default. You can configure the interval in application.properties file:

/opt/wandisco/git-multisite/replicator/properties/application.properties
resourcemonitor.period.min=10
Value is in minutes, and only run through the UI, it is not handled directly by the replicator.
Add New Resource Monitor
Enter the path to a resource that you wish to monitor, then click "Add".
Resource Monitors
This section lists all resources currently being monitored. Click on "Configure" to change monitor settings, "Delete" to remove a monitor.

For more information about setting up monitors, read 22. Setting up resource monitoring.

Notifications

settings - notifications

The notifications system provides Git administrators with the ability to create event-driven alert emails. Set up one more more gateway (mail servers), add destination emails to specify recipients, create email templates for the content of the alert emails, then set the rules for which event should trigger a specific email.

Gateways

settings - gateways

The Gateways section stores the details of those email relay servers that your organization uses for internal mail delivery. You can add any number of gateways, Git MultiSite will attempt to delivery notification emails using each gateway in their order on the list, #0, #1, #2 etc.

Git MultiSite will attempt delivery via the next gateway server once it has attempted delivery a number of times equal to the Tries number. It will repeat a try after waiting a number of settings equal to the Internval setting.

How Git MultiSite gives up on delivering to a gateway
Example. Gateway #0 is offline. With Tries set to 5 and Interval set to 600, MultiSite will attempt delivery using the next gateway (#1) after 600s x 5 = 50 minutes.
IP/Hostname of SMTP Server:
your email server's address.
SMTP Server Port
The port assigned for SMTP traffic (Port 25 etc).
Encryption Type
Indicate your server's encryption type - None, SSL (Secure Socket Layer) or TLS (Transport Layer Security). SSL is a commonly used. For tips on setting up suitable keystore and truststore files see Setting up SSL Key pair.

If you're not familiar with the finer points of setting up SSL keystores and truststores it is recommended that you read the following article: Using Java Keytool to manage keystores.
Authentication Required
Indicate whether you need a username and password to connect to the server - requires either "true" or "false".
User Name
If authentication is required, enter the authentication username here.
Password
If authentication is required, enter the authentication password here.
Sender Address
Provide an email address that your notifications will appear to come from. If you want to be able to receive replies from notifications you'll need to make sure this is a valid and monitored address.
Number of Tries Before Failing
Set the number of attempts Git MultiSite makes in order to send out notifications.
Interval Between Tries (Seconds)
Set the time (in seconds) between your server's attempts to send notifications.

Destinations

settings - destinations

In the Destinations panel is used to store email address for notification recipients. Add, Edit or remove email addresses.

Templates

settings - templates

The templates panel is used to store email content. You create messaging to match those events for which you want to send user notifications.

Template Subject
Use this entry field to set the subject of the notification email. You'll want this subject to be descriptive of the event for which the email will be triggered. .
Body Text
Enter the actual message that you want to send for a particular situation/event.

Rules

settings - rules

Use the Rules panel to actually setup up your notification emails. Here you'll associate email templates and destination emails with a particular system event. For example, you may create an email message to send to a particular group mailing list in the event that a repository goes into Read-only mode. Selecting descriptive subjects for your templates will help you to select the right templates here.

Event
Choose from the available list of trigger events.
Template
Destination

Logging Setting

The Logging Setting lets you quickly add or modify Java loggers via the admin console, rather than making manual edits to the logger file:

<install-dir>/replicator/properties/logger.properties.

settings - Logging Settings

Loggers are usually attached to packages, here, the level for each package is available to modify or delete. The global level is used by default, so changes made here are used to override the default values. Changes are applied instantly but in-memory only and are forgotten after a restart of the replicator (unless they are saved). For information about adding or changing loggers, see 34. Logging Setting Tool.

settings - entry of logging

System Data

settings - System Data

The System Data table provides a list of read-only settings that were either provided during setup or have since been applied:

NodeID
The name that you gave the node during installation.
Location ID
A unique string that is used to identify the server during an induction.
Host Name/IP Address
The Hostname / IP address of the server hosting the node.
DConE port
DConE port (actually it's DConE 2) handles agreement traffic between sites. The Default is 6444.
Database Location
The full path to MultiSite's database. By default this will be <install-dir>/replicator/database.
Jetty HTTP Port
The HTTP port is used for browser access to the User Interface.
Content Server Port
The port that will be used to transfer replicated content (repository changes). This is different from the port used by WANdisco's DConE agreement engine.
Content Location
The directory in which replication data is stored (prior to it having been confirmed as replicated).
License information
Details concerning Git MultiSite product license - such as the date of expiry.

View REST API Documentation - This link takes you to your node's local copy of the API documentation. The link goes to the following location: http://<Node IP>:8080/apidocs/. This documentation is generated automatically and ties directly into your server's local resources. There is an only copy of the latest API documenation available in this admin guide, note though that it has been lifted from an installation and will link to resources that will not be available on the website (resulting in dead links).

Module Versions

The module versions provides a list of the component parts of the Git MultiSite application. This is useful if you need to verify what version of a component you are using - such as if you need to contact WANdisco for support.

1.5 SECURITY

The security tab is used to manage admin accounts, either entered manually into Git MultiSite or managed through an LDAP authority. On the tab is an entry form for multiple admin accounts, along with LDAP Settings for binding MultiSite to one or more LDAP services.

Acc

Git MultiSite - Security

Add User
Enter the details of an additional administrator who will be able to login to the Git MultiSite Admin UI. See Adding additional users for more information.
Add Authority
Enter the details of one or more LDAP authorities for managing administrator access. See Adding LDAP authorities for more information.
Disable Managed Users
This feature lets you block access to the Git MultiSite Admin UI by non-LDAP users. See Disabling (Internally) Managed Users below.
Enable SSO
This button will only be available to click if you have entered valid Kerberos settings. When enabled it places Git MultiSite's admin console into Single Sign-on mode. When enabled accessing the admin UI will use Kerberos instead of the username and password login form. In the enabled state the button will change to say Disable SSO.
Export Security Settings
The data entered into the Securities tab can be backed up for later re-importing by clicking the Export Security Settings button. The data is stored in /opt/wandisco/git-multisite/replicator/export/security-export.xml which should be included in any backup procedures you are running. You will need access to the file from your desktop during a re-import.
Import Security Settings
Click the Import Security Settings button if you need to restore your Security settings, such as after a re-installation of Git MultiSite. The import will proceed providing that you can enter a file path to the security-export.xml file.
Reload
Click on the reload button to refresh the Admin UI screen, you will need to do this in order to view any changes that you make.

Admin Account Precedence

Git MultiSite uses the following order of precedence when checking for authentication of users:

Explanation

This provider implementation tries to authenticate user credentials against either the list of internally managed users, or against any number of LDAP authorities, or both -- depending on how the administrator has configured the application.

When authenticating against LDAP authorities, each one is tried in sequence until one either grants access or they all deny access. In the event that they all deny access, only the error from the last authority tried will be returned.

Admin Accounts
  • Admin account changes are replicated to all nodes.
  • Changes to admin accounts are handled as proposals that require agreement from a majority of every node in the replication network.
  • Admin account changes are reported into the audit log.

Disable (Internally) Managed Users

Click the Disabled Managed Users button if you want to control access to Git MultiSite exclusively through LDAP. Once clicked, any Internally managed users will no longer be able to log into the Admin UI after they next log out. From that point only LDAP managed users will have access to the Git MultiSite Admin UI.

Re-enable Internally Managed users
If, after disabling Internally Managed Users you need to enable them again -- should there be a problem with your LDAP authorities -- then it is possible to enable access again by logging into the node via a terminal window (with suitable permissions), navigate to the following directory:
/opt/wandisco/git-multisite/replicator
and run the reset script:
java -jar resetsecurity.jar
Any internally managed users who remain in Git MultiSite's database will have their access restored.

Internally Managed Users

Acc

Git MultiSite Plus - Internally Managed Users

This table lists those admin users who have been entered through the Admin UI or imported using the Import Security Settings, along with the first admin account.

Admin Account #1

Note that the first admin account is the one set up during the installation of your first node. The credentials specified during this installation are stored to the users.properties file which is then used during the installation of all subsequent nodes.

Admin Account Mismatch
The users.properties file is used to ensure that exactly the same username/password is used on all nodes during installation. In the even that there's a mismatch then you wouldn't be able to connect the nodes together (through the Induction process). Rather than clean-up and reinstall you can fix this by manually syncing the password files.

Admin Account #1 can be removed but the last admin account remaining on the system will not be deletable to ensure that it isn't possible for an administrator to be completely locked out of the admin UI.

Kerberos

Support for the Kerberos protocol is now included. When enabled, Kerberos handles authentication for access to the admin UI, where the administrator is automatically logged in if their browser can retrieve a valid Kerberos ticket from the operating system.

You can't mix and match log-in type
When Kerberos SSO is enabled only users who are set up for Kerberos will be able to access the admin UI. The username and password login form will be disabled. If you ever need to disable Kerberos authentication this can be done using the authentication reset script which will return your deployment to the default login type.
Kerberos

kerberos settings entry form

Service Principal
A service principal name (SPN) is the name that a client uses to identify a specific instance of a service.

Example:
HTTP/host.example.com
Keytab File
The keytab is the encrypted file on disk where pairs of Kerberos principals and their keys are stored.

Example:
/tmp/krb5.keytab
Kerberos 5 Realm Configuration File
The krb5 configuration file location of the replicator host's Kerberos 5 realm configuration.

Example:
/etc/krb5/krb5.conf
Never replicated, always configured 'per-node'
Kerberos configuration is not replicated around the replication network because each node in the network needs its own host-specific configuration. This configuration is node-local only. The configuration needed is the host-specific service principal name, noted in the settings above. e.g.

On most systems the location of the host's encrypted key table file will be something something like:
/etc/krb5.keytab
The location of the host's Kerberos 5 realm configuration may be something like /etc/krb5.conf or /etc/krb5/krb5.conf

LDAP Authorities

Kerberos

LDAP Authority entry forms

Node-Local LDAP Authorities
If chosen, then only the local node will use the LDAP authority for authentication.
Replicated LDAP Authorities
If replicated is chosen, all nodes in the replication network can use the LDAP authority for authentication.

Mixing Local and WAN-based authorities

Both kinds of authority are supported simultaneously, with the node-specific LDAP authorities taking precedence over WAN-based authorities in order to support the use-case where, for example, a particular node may prefer to use a geographically closer LDAP directory. Also, if multiple LDAP authorities of either type are configured then the order in which they are consulted is also configurable, using the +/- buttons at the end of each entry.

Order
LDAP authorities are listed in the order of execution that you set when defining each authority's properties.
Url
The URL of the authority. The protocol "ldap://" or "ldaps://" are required.
Bind User DN
Identify the LDAP admin user account that Git MultiSite will use to query the authority.
Search Base
This is the Base DN, that is the location of users that you wish to retrieve.
Search Filter
A query filter that will select users based on relevant LDAP attributes. For more information about query filter syntax, consult the documentation for your LDAP server.
Remove
Click to remove the authority from Git MultiSite.
Edit
Click to make changes to the authority's settings.

The usual configuration options are supported for each configured LDAP authority: URL, search base and filter and bind user credentials. Note that the bind user's password cannot be one-way encrypted using a hash function because it must be sent to the LDAP server in plain text, so for this reason the bind user should be a low privilege user with just enough permissions to search the directory for the user being authenticated. Anonymous binding is permitted for those LDAP servers that support anonymous binding.

LDAP Home or away

When adding an LDAP authority, the configuration can be selected to be either replicated or node-specific.

Replicated LDAP Authorities

If node-specific is chosen, then only the local node will use the LDAP authority for authentication. Both kinds of authority are supported simultaneously, with the node-specific LDAP authorities taking precedence over WAN-based authorities in order to support the use-case where a particular node may prefer to use a geographically closer LDAP directory, for example. Also, if multiple LDAP authorities of either type are configured then the order in which they are consulted is also configurable.

The usual configuration options are supported for each configured LDAP authority: URL, search base and filter and bind user credentials.

Just enough permissions
The bind user's password cannot be one-way encrypted using a hash function because it must be sent to the LDAP server in plain text. For this reason the bind user should be only have enough privileges to search the directory for the user being authenticated. Anonymous binding is permitted for those LDAP servers that support anonymous binding.

2. Architecture overview

The diagram below outlines the Git MultiSite architecture, in term of how the application is split up and how those component parts communicate with each other and the outside world.

settings

Key points

  • Admin UI and Replicator are run in separate Java processes.
  • The Admin UI interacts with the application thought the same API layer that is available for external interactions. This layer enforces seperation of concerns and handles authentication and authorization of all user interactions.
  • Git server runs with a WANdisco version of the FSFS server libraries. Whilst Read operations are passed to SVN's regular FSFS, writes are delegated to the Git MultiSite replicator.
  • The DConE 2 Coordination protocol handles the agreement of transaction ordering between nodes via port 6444. The delivery of the actual replicated content (Git commits etc) is handled by the Content Distribution layer on port 4321.

3. Install directory structure

Git MultiSite is installed to the following path by default:

/opt/wandisco/git-multisite/
It's possible to install the files somewhere else on your server, although this guide will assume the above location when discussing the installation.

Inside the installation directory you'll find the following files and directories:

drwxr-xr-x  2 root     root     4096 Oct 24 07:55 bin
	-r-xr-xr-x 1 root root  9551 Oct 22 10:26 backup
	-r-xr-xr-x 1 root root 23260 Oct 22 10:26 emr_driver
	-r-xr-xr-x 1 root root 20328 Oct 22 10:26 emr_nodes
	-r-xr-xr-x 1 root root  3516 Oct 22 10:26 gitms_shell
	-r-xr-xr-x 1 root root  1686 Oct 22 10:26 git-multisite
	-r-xr-xr-x 1 root root 20878 Oct 22 10:26 remove_nodes_eco_system
	-r-xr-xr-x 1 root root 19278 Oct 22 10:26 remove_nodes_replication_groups
	-r-xr-xr-x 1 root root  1620 Oct 22 10:26 rollback
	-r-xr-xr-x 1 root root 12349 Oct 22 10:26 sysInfo.sh
	-r-xr-xr-x 1 root root 24368 Oct 22 10:26 talkback
	-r-xr-xr-x 1 root root  4093 Oct 22 10:26 watchdog
	
drwxr-xr-x  2 wandisco wandisco 4096 Oct 24 07:55 config
	-rw-r--r-- 1 root root 317 Oct 24 07:55 main.conf
	
drwxr-xr-x  2 root     root     4096 Oct 24 07:55 lib
	-r--r--r-- 1 root root 20719 Oct 22 10:26 init-functions.sh
	-r--r--r-- 1 root root   127 Oct 22 10:26 lsb-functions.sh
	
drwxr-xr-x  3 wandisco wandisco 4096 Oct 24 07:56 local-ui
	drwxr-xr-x 2 wandisco wandisco 4096 Oct 24 07:55 ui-logs
	-rw-r--r-- 1 wandisco wandisco  384 Oct 24 07:58 ui.properties
	
drwxr-xr-x  2 wandisco wandisco 4096 Oct 24 07:56 logs
	-rw-r--r-- 1 wandisco wandisco   88 Oct 24 07:56 multisite.log
	-rw-r--r-- 1 wandisco wandisco    0 Oct 24 07:56 replicator.20141024-075611.log
	-rw-r--r-- 1 wandisco wandisco 3078 Oct 24 07:56 ui.20141024-075542.log
	-rw-r--r-- 1 wandisco wandisco 2041 Oct 24 07:56 watchdog.log
	
drwxr-xr-x 13 root     root     4096 Oct 24 07:55 replicator
	drwxr-xr-x 2 wandisco wandisco   4096 Oct 24 07:55 content
	drwxrwxr-x 2 wandisco wandisco   4096 Oct 24 07:55 content_delivery
	-rw-r--r-- 1 root     root        129 Sep 26 11:20 COPYING
	drwxr-xr-x 5 wandisco wandisco   4096 Oct 24 07:56 database
	drwxr-xr-x 3 root     root       4096 Oct 24 07:55 docs
		drwxr-xr-x 4 root root 12288 Oct 24 07:55 api
	drwxr-xr-x 2 wandisco wandisco   4096 Oct 24 07:55 export
	drwxr-xr-x 7 root     root       4096 Oct 24 07:55 gfr
	-rw-r--r-- 1 root     root     381022 Sep 26 11:31 git-ms-replicator-1.4.0-1693.jar
	-rw-r--r-- 1 root     root     381029 Sep 26 11:31 git-ms-replicator-1.4.0-SNAPSHOT-updateinetaddress.jar
	-rw-r--r-- 1 root     root     380728 Sep 26 11:31 git-ms-replicator-gitmsrestore.jar
	drwxr-xr-x 2 wandisco wandisco   4096 Oct 24 07:55 hooks
	drwxr-xr-x 2 root     root       4096 Oct 24 07:55 lib
	-rw-r--r-- 1 root     root      44869 Sep 26 11:20 LICENSE
	drwxr-xr-x 4 wandisco wandisco   4096 Oct 24 07:56 logs
	-rw-r--r-- 1 root     root     741999 Sep 26 11:28 NOTICE
	drwxr-xr-x 2 wandisco wandisco   4096 Oct 24 07:56 properties
		-rw-r--r-- 1 wandisco wandisco  954 Oct 24 07:56 application.properties
		-rw-r--r-- 1 wandisco wandisco  256 Oct 24 07:55 license.key
		-rw-r--r-- 1 wandisco wandisco  443 Jul 24 20:33 log4j.properties
		-rw-r--r-- 1 wandisco wandisco 1752 Sep  4 11:18 logger.properties
		-rw-r--r-- 1 wandisco wandisco  188 Oct 24 10:30 users.properties	
	drwxr-xr-x 2 root     root       4096 Oct 24 07:55 properties.dist
	-rwxr-xr-x 1 root     root       7683 Oct 22 10:26 resetSecurity.jar
	-rwxr-xr-x 1 root     root      25056 Oct 22 10:26 transformer-tool.jar
	-rw-r--r-- 1 root     root        544 Sep 26 11:28 VERSION-TREE
	
drwxr-xr-x  3 wandisco wandisco 4096 Oct 24 07:55 tmp
drwxr-xr-x  2 root     root     4096 Oct 24 07:55 ui
drwxr-xr-x  3 wandisco wandisco 4096 Oct 24 07:55 var

4. Properties files

The following files store application settings and constants that may need to be referenced during troubleshooting. However, you shouldn't make any changes to these files without consulting WANdisco's support team.

/opt/wandisco/git-multisite/replicator/properties/application.properties
file contains settings for the replicator and affects how MultiSite performs. [view sample]
Temporary requirement:
If you (probably under instruction from WANdisco's support team) manually add either connectivity.check.interval or sideline.wait to the applications property file then you must add an "L" (Long value) to the end of their values so they are converted correctly. View our sample application.properties file to view all the properties that are suffixed as "Long".
/opt/wandisco/git-multisite/replicator/properties/logger.properties
handles properties that apply to how logging is handled. [view sample]
/opt/wandisco/git-multisite/replicator/properties/replicator-api-authorization.properties
handles properties that are used for managing remote access through the API. [view sample]
/opt/wandisco/git-multisite/replicator/properties/users.properties
contains the admin account details which will be required when installing second and subsequent nodes. [view sample]
/opt/wandisco/git-multisite/local-ui/ui.properties
contains settings concerning the graphical user interface such as widget settings and timeout values. Stored in this file is the UI Port number and is considered the defacto recording of this value, superceding the version stored in the main config file /opt/wandisco/git-multisite/config/main.conf. You can [view a sample]

5. Replication strategy

Git MultiSite provides a toolset for replicating Git repository data in a manner that can maximise performance and efficiency whilst minimising network and hardware resources requirements. The following examples provide you with a starting point for deciding on the best means to enable replication across your development sites.

5.1 Replication Model

In contrast with earlier replication products, Git MultiSite Plus is no longer based upon a network proxy that handles file replication between replica. Now, replication is handled at the filesystem level, via FSFS.
** This is history **

Git MultiSite differs from earlier WANdisco replication products on a number of levels.

Limitations of the old model

** This is history **

Git MultiSite differs from earlier versions in that it replicates at the file system level.

5.1.1 Per-Repository Replication

Git MultiSite is able to replicate on a per-repository basis. This way each site can host different sets of repositories and replicate repositories this means that you can have different repositories replicate as part of different replication groups.
** Alert! **

5.1.2 Dynamic membership evolution

** evolution without stopping work **

No need for a synchronized stop - Git MultiSite allows replication groups to change their membership on-the-fly.

A repository can only replicate to a single replication group at any one time, although it is possible to move between replication groups as required - this can now be done on-the-fly, nodes can be added or deleted without the need to pause all replication (with a synchronized stop)

Git MultiSite offers a great deal of flexibility in how repository data is replicated. Before you get started it's a good idea to map out which repositories at which locations you want to replicate.

5.1.3 WANdisco replication and compression

There are a number of WAN network management tools that offer performance benefits by using data compression. The following guide explains how data compression is already incorporated into WANdisco's replication system, and what effect this built-in compression may have on various forms of secondary compression.

Network management tools may offer performance benefits by on-the-fly compression of network traffic, however it's worth nothing that WANdisco's DConE replication protocol is already using compression for replicated data. Currently Zip compression is used before content is distributed using the Content Distribution component of DConE.

Traffic Management systems that provide WAN optimization or "WAN Acceleration" may not provide expected benefits as a result of WANdisco's compression. The following list highlights where duplication or redundancy occurs.

Compression
Encoding data using more efficient storage techniques so that a given amount of data can be stored in a smaller file size.
WANdisco effect: As replicated data is already compressed, having a WAN accelerator appliance compress the data again is a waste of time - however, as long as it can "fill the pipe", i.e. keep the throughput of traffic faster rate than the network can consume it then its not going to negatively impact data transfer.
Deduplication
Eliminating the transfer of redundant data by sending references instead of the actual data. By working at the byte level, benefits are achieved across IP applications. In truth, Data-deduplication offers the most benefit when there's a lot of repetition in the data traffic.
WANdisco effect: Because the data is already compressed then data-deduplication (by whatever WAN optimization solution) will not be effective, when data is compressed any small change at the start of the data stream propagates through the data stream and defeats data-deduplication. You can read more about this effect in this external article - rsyncable-gzip
Latency optimization
Various refinements to the TCP implementation (such as window-size scaling, selective Acknowledgement etc.
WANdisco effect: DConE does not use TCP / network layer techniques. This form of optimization won't have any impact on WANdisco Replication.

5.2 Creating resilient replication groups

Git MultiSite is able to maintainvrepository replication (and availability) even after the loss of nodes from a replication group. However, there are some configuration rules that are worth considering:

Rule 1: Understand Learners and Acceptors

The unique Active-Active replication technology used by Git MultiSite Plus is an evolution of the Paxos algorithm, as such we use some Paxos concepts which are useful to understand:

Rule 2: Replication groups should have a minimum membership of three learner nodes

Two-node replication groups are not fault tolerant, you should strive to replicate according to the following guideline:

Rule 3: Learner Population - resilience vs rightness

Rule 4: 2 nodes per site provides resilience and performance benefits

Running with two nodes per site provides two important advantages.

5.3 Content distribution strategy

WANdisco's replication protocol separates replication traffic into two streams, the coordination stream which handles agreement between voter nodes and the content distribution stream through which Git repository changes are passed to all other nodes (that is "learner" nodes that store repository replicas).

** evolution without stopping work **

Git MultiSite provides a number of tunable settings that let you apply different policies to content distribution. By default, content distribution is set up to prioritize reliability. You can, instead, set it to prioritize speed of replication. Alternatively you can apply a policy that prioritizes delivery of content to voter nodes. These policies can be applied on a per-site, per-repository and replication group basis providing a fine level of control providing that you take care to set things up properly.

5.3.1 Changing content distribution policy

To set the policy, you need to make a manual edit to Git MultiSite's Application properties file:

/opt/wandisco/git-multisite/replicator/properties/application.properties
Changes require a restart:
Changing the current strategy requires the modification of properties files that the replicator only reads at start-up. As a result any changes to strategy require that the replicator be restarted before the change will be applied.

Editable content distribution properties

content.push.policy=faster
content.min.learners.required=true
content.learners.count=1
content.thread.count=10

Above is an example Content Distribution Policy. We'll breakdown what each of the settings does:

content.push.policy

This property sets the priority for Content Distribution. It can have one of three options which set the following behavior. Each option tells MultiSite to use a different calculation for relating replication agreement to the actual delivery of replicated data.

"reliable" Policy:
Replication can continue if content is available to a sufficient number of learners (the value of content.learner.count, not including the node itself).

"Reliable" is the default setting for the policy. Content must be delivered to a minimum number of nodes, the value of the property (content.min.learners.required, for a proposal to be accepted - which will allow replication to continue.

Reliable because it greatly reduces the risk that a proposal will be accepted when the corresponding content cannot be delivered (due to a LAN outage etc). This policy is less likely to provide good performance because replication is kept on hold until content has been successfully distributed to a greater number of nodes than would be the case under the "faster" policy.

Setting the corresponding "content.learner.count" value
  • This value is the number of learners (exluding the originating node) to which content is delivered before a proposal will be raised for agreement.

  • If content.min.learners.required is false then the system will automatically lower content.learner.count to ensure that replication can continue om the event of a loss of node(s).

  • If the value is higher than the number of available nodes then Git MultiSite is notified of the failure and the value is dropped further, ( based on 'content.min.learners.required').

  • If content.min.learners.required is true the then Git MultiSite is notified of a failure to summon enough voters for agreement - content.leaner.count is automatically dropped again (to equal one less than the number of available learners), or if that's no longer possible, a failure is reported.

Setting the corresponding "content.min.learners.required" value

For "reliable" policy that offers the upmost reliability, set this to "true".

true enforces the requirement
When content.min.learners.required is set to "true" Git MultiSite will not lower the content.learner.count in light of insufficient learner nodes being available.

Example:

content.learner.count=5, content.min.learners.required=true

After an outage there are now only 4 learner nodes in the replication group - replication will be halted because there aren't enough available learners to validate a content distribution check.

content.learner.count=5, content.min.learners.required=false

After an outage there are now only 4 learner nodes in the replication group - replication will not be halted because Git MultiSite Plus will automatically drop the count to ensure that it doesn't exceed the total number of learners in the group.
"acceptors" Policy:
Voting can commence if content is delivered to 50% of voters (acceptors), include self if a voter

Content push policy only deals with delivering content to voters. This policy is ideal if you have a small number of voters. You don't want replication to continue until you have confirmed that at least half the voters have the corresponding payload. This policy supports a "follow-the-sun" approach to replication where the voter status of nodes changes each day to correspond with the locations where most development is being done. This ensures that the sites with the most development activity will get the best commit performance.

Setting the corresponding "content.learner.count" value

For "Acceptors" policy this is ignored.

Setting the corresponding "content.min.learners.required" value

For "Acceptors" policy this is ignored - learners do not factor into the policy, only voters (acceptors).

"faster" Policy:
Replication can continue if content available to x learners (not including self)
OR [delivered to half the voters, including self if its a voter]
where x = content.learner.count

Setting the policy to 'faster' lowers the requirement for successful content distribution so that replication can continue when fewer nodes (than under the reliable policy) have received the payload for the proposal. 50% of voters (acceptors) must receive the content. It's faster because if there's a slow or intermittent connection somewhere on the WAN, it wouldn't delay agreement/ slow down replication. It is less reliable because it increases the possibility that the ordering of a proposal can be agreed upon, only for the corresponding content to not get delivered. The result would be a break in replication.

Under the "Faster" policy "content.min.learners.required" can be set to "True" or "False" as the learners check takes a secondary role. If set to "True", even if the conditions are not satisfied then you can still achieve quorum.

Setting the corresponding "content.learner.count" value
  • For the 'faster' policy the node in question is always included in the count. In the event that this is not satisfied, a further check is made against acceptors. The check passes if half or more (rounded up) of the available voters took delivery of the content.

Setting the corresponding "content.min.learners.required" value

For Faster, set this to "false".

All the acceptors (voters) must also be learners (carry replica data that can be changed).
If all the acceptors are not learners, we switch to 'reliable' policy with a log message. A node always includes itself in the count - in contrast with the "reliable" policy where a node never includes itself in the count.

5.3.2 Steps for a policy change

Use this procedure to change between the above Content Distribution policies.

  1. Make a back up and then edit the /opt/wandisco/git-multisite/replicator/properties/application.properties file (Read more about the properties files).

  2. Change the value of content.min.learners.required, make it "true" for reliability, "false" for speed (default is true).

  3. Save the file and perform a restart of the node.

5.3.3 Set policy on a per-state machine basis

When editing the property, add the state machine identity followed by '.content.push.policy'. e.g.

<machine_identity_name>.content.push.policy=faster

The system assigns policy by looking up the state machine policy followed by 'content.push.policy'. If none are available, 'reliable' is chosen. Conditional switch between 'faster' and 'reliable' remains in effect regardless of the policy.

Example 1 - Faster policy on a 2-node replication group

Two-node Replication Group, NodeA (Active Voter) and NodeB (Active).

content.push.policy=faster
content.min.learners.required=false
content.learners.count=1

Example 2 - Acceptors policy on a 4-node replication group

Four nodes split between two sites. On Site 1 we have NodeA and NodeB, both are Active Voters. On site 2 we have NodeC (AV) and NodeD (A).

content.push.policy=acceptors
content.min.learners.required=true
content.learners.count=1

content.thread.count

Content Distribution will attempt parallel file transfer if there are enough threads available. The number of threads is controlled by a configuration property "content.thread.count" which is written to the application.properties file.

content.thread.count=10

The default value is 10. This provides plenty of scope for parallel file transfer. However, as each thread consumes system overhead in the form of a file descriptor and some memory space, servers that are under regular heavy load should lower the count to 2.


5.4 Dealing with replication lag

There are some time-sensitive activities where you need to work around replication lag.

Example

This lag is unavoidable in any real-world application and all replica should soon be back in sync.

workaround

However, there are some activities such as checkpointing, tagging or kicking off a build where you want to ensure that you're up-to-date at a particular node. It's possible to build some checks into your deployment that use the REST API to call each repository copy to verify if there are still changes in transit. Specifically it's the repositories endpoint that offers a count of pending transactions at the node. We will look to enhance this particular endpoint in the future to include more information like the size of pending data transfers.

Example API Return

http://<SERVER-ADDRESS>:8082/api/repository/search?filesystemPath=/opt/Git/Repo123& withPendingTransactions=true

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<svn-repository>
    <dsmId>f20992b7-fc7a-11e3-88fc-002590c13198</dsmId>
    <fileSystemPath>/opt/Git/Repo3893</fileSystemPath>
    <globalReadOnly>false</globalReadOnly>
    <latestRevision>
        <revisionNum>8993</revisionNum>
        <size>13930000</size>
        <timestamp>1402537258207</timestamp>
    </latestRevision>
    <localReadOnly>false</localReadOnly>
    <membershipId>4a4963f8-fc7a-11e3-88fc-002590c13198</membershipId>
    <name>Repo3893</name>
    <pendingTransactions>7</pendingTransactions>
    <readOnlyReason></readOnlyReason>
    <replicationGroupId>4abe1e8c-fc7a-11e3-88fc-002590c13198</replicationGroupId>
    <repositoryIdentity>f20992b6-fc7a-11e3-88fc-002590c13198</repositoryIdentity>
    <state 
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="deployedStateDTO"/>
    </svn-repository>    

6. Guide to node types

Each replication group consists of a number of nodes and a selection of repositories that will be replicated.

There are now some different types of site:

** node **Active
An Active node has users who are actively committing to Git repositories, which results in the generation of proposals that are replicated to the other nodes. However, it plays no part in getting agreement on the ordering of transactions. Active nodes support the use of the Consistency Checker tool.
** node **Active Voter
An Active Voter is an Active node that also votes on the order in which transactions are played out. In a replication group with a single Active Voter, it alone decides on ordering. If there's an even number of Active Voters, a Tiebreaker will have to be specified. Active nodes support the use of the Consistency Checker tool.
** node **Passive
A node on which repositories receive updates from other nodes, but doesn't permit any changes to its replicas from Git clients - effectively making its repositories read-only. Passive nodes are ideal for use in providing hot-backup. Passive nodes do not support the reliable use of the Consistency Checker tool.
** node **Passive Voter
A passive node that also takes part in the vote for transaction ordering agreement.

Use for:
  • Dedicated servers for Continuous Integration servers
  • Sharing code with partners or sites that won't be allowed to commit changes back
  • In addition, these nodes could help with HA as they add another voter to a site.
  • Passive nodes do not support the reliable use of the Consistency Checker tool.

  • ** node **Voter (only)
    A Voter-only node doesn't store any repository data, it's only purpose is to accept transactions and cast a vote on transaction ordering. Voter-only nodes add resilience to a replication group as they increase the likelihood that enough nodes are available to make agreement on ordering.

    The Voter-only node's lack of replication payload means that it can be disabled from a replication group, without being removed. ** node **
    A disabled node can be re-enabled without the need to interrupt the replication group.
    ** node **Tiebreaker
    In the event of even number of voters in the Replication Group the Tiebreaker gets the casting vote. The Tiebreaker can be applied any type of voter: Active Voter, Passive Voter or Voter.
    ** node **Helper
    When adding a new site to an existing replication group you will select an existing site from which you will manually copy or rsync the applicable repository data. This existing site enters the 'helper' mode in which the same relevant repositories will be read-only until they have been synced with the new site. By relevant we mean that they are replicated in the replication group in which the new site is being added.
    ** node **New
    When a site is added to an existing replication group it enters an 'on-hold' state until repository data has been copied across. Until the process of adding the repository data is complete, New nodes will be read-only. Should you leave the Add a Node process before it has completed you will need to manually remove the read-only state from the repository.

    Acceptors, Proposers and Learners?

    The table below shows which node roles are acceptors, proposers or learners.

    Node Type Acceptor Proposer Learner
    Active (A) N Y Y
    Active Voter (AV) Y Y Y
    Passive (P) N N Y
    Passive Voter (PV)Y N Y
    Voter Only (V) Y N N

    Key

    Learners are either Active or Passive nodes:
    Learns proposals from other nodes and takes action accordingly. Updates repositories based on proposal (replication).
    Proposers are Active nodes:
    To be able to commit to the node the node must be able to make proposals.
    Acceptors are Voters:
    Accepts proposals from other nodes and whether or not to process or not (ordering/achieve quorum).