Install data transfer agents
Data transfer agents let you scale beyond the limitations of a single host by sharing the workload of transferring data across additional hosts with access to your source. Data transfer agents accelerate data transfer by removing network, memory, and CPU bottlenecks.
This means you can scale Data Migrator up to the capacity of your wide area network, or until you reach another limit, such as the data transfer capability of your storage environment.
You can add an unlimited number of data transfer agents to Data Migrator.
Data transfer agents are not supported with local filesystem sources.
Data transfer agents assist with moving data but aren't involved in scaling metadata migrations, which are typically under a much lower load and don't experience bottlenecks from being deployed on a single host.
If you decide to use data transfer agents, agents become responsible for transferring the data to your target filesystem(s). Data Migrator doesn't transfer the data. If you stop the agents, your data isn't migrated. If you want to stop using agents, and use Data Migrator to move the data, you must deregister your agents.
See the section Remove an agent.
If you want to manage traffic, do so on standalone instances of Data Migrator using bandwidth limits.
Older operating systems (CentOS 6) may not have the required libraries to provide secure channel communication. See the following Knowledge base article to disable secure channel communication.
Bandwidth management. When you add one or more data transfer agents, adjust the configuration of each agent to control the bandwidth limit. See the section Data Migrator with agent bandwidth.
Prerequisites
You are the system administrator.
Additional hosts are deployed on your network that can access the source storage environment.
If you're transferring data from a Hadoop environment, you have the default name hdfs for your system user and system group. See Configure system users.
Port 1433 is open between the host running Data Migrator and all hosts running data transfer agents.
See Network requirements.
The agent must have the same version as Data Migrator. If the agent version isn't the same, the agent status isn't recognized as valid.
Recommended specifications
For recommended machine specifications, see the Installation Prerequisites.
Install an agent
Install each agent on a separate host with client access to the source filesystem.
Download either the RPM or the Debian installer for your selected host. Links to the applicable version of the data transfer agent are available if you select Add a data transfer agent on the data transfer agent page in the UI.
- RPM
curl -L -o livedata-migrator-data-agent-installer.sh "https://docs-router.wandisco.com/router?url=/data-transfer-agents/download?version=1.26.2-rpm"
- Debian
curl -L -o livedata-migrator-data-agent-installer.sh "https://docs-router.wandisco.com/router?url=/data-transfer-agents/download?version=1.26.2-deb"
Make the installer executable:
chmod +x livedata-migrator-data-agent-installer.sh
Run the installation command as the root (or sudo) user:
./livedata-migrator-data-agent-installer.sh
By default, the agent is installed using the hdfs user and hdfs group.
To install the agent with a non-default system user, see Custom user installation below.
Custom user installation
To install the agent using a custom user or a custom user group, use the following command:
./livedata-migrator-data-agent-<version>_installer.sh -- --user <custom user> --group <custom group>
Authentication
A successful installation prints the authentication token in your terminal for Data Migrator to authenticate to the agent.
The format of the generated token is:
----- BEGIN AGENT TOKEN -----
<example_token_text>
----- END AGENT TOKEN -----
The authentication token is generated with the hostname of the system in the /etc/hostname file. The hostname may be different to the hostname of the node on the network. To ensure the node is valid when added to the UI, the generator includes a script, gen-certs-and-token.sh, that allows you to create custom hostnames and ports manually.
Run the script as described in steps 1 to 3 above.
Custom user authentication
If you are a custom user, run the following commands:
chown <custom_user>:<custom_group> gen-certs-and-token.sh
sudo -u <custom_user> ./gen-certs-and-token.sh --force
The output details the directories in which agent files are stored. See the output example below.
Token Generator
Certificates and keys will be stored in /opt/wandisco/livedata-migrator-data-agent/certs,
Keystore = keystore.p12,
Data Agent Host = example01-vm3.bdauto.wandisco.com,
Data Agent Server Port = 1433
Connection Token will be stored in /opt/wandisco/livedata-migrator-data-agent/connection_token,
Security config file is /etc/wandisco/livedata-migrator-data-agent/security.properties,
Registration request file is /opt/wandisco/livedata-migrator-data-agent/reg_data_agent.json
(Re)start the livedata-migrator-data-agent service and call the livedata-migrator API to (re)register an agent:
curl -XPOST -H "Content-Type: application/json" -d @/opt/wandisco/livedata-migrator-data-agent/reg_data_agent.json http://host:18080/scaling/dataagents/
Agent connection token:
----- BEGIN AGENT TOKEN -----
H4sIAAAAAAAAAKVU246jOBT8lVFes2kIhA7ZN2NubtokBEJCxAu3kHCxczcw2n9fMtvq6d3eWbW0SJY4JUy5TtXx90ESwex8Hfw+GD0eRTOQ/Q1qSw/pCAJP+4GGBCOkrDsIgQNzwJAC8n55wFLy/LQvi/nCcVRQgAt2KbNAoPp9rUL9FJk+iVTQhOR1s73iVjpsDX2cGvItFfTb1qi6QJBvgTC7OrzGzH1i4wIzXGg89lA79/B43WO2ill/hu4D+h9cv2IKyUeutQc8JU/eTo+Un0qwojAbAuAm0qYqd8PVcHlEbPecJCF5jizhwux8bNdc5zISFMCdK8OgKdrphnbEjVM6FVbxfkHq4jky7Ik/WyxdpvOFRd14ExJ9foK0wUzzwELJbd9cYgXsZE3xgAock8OAGhCeDBdPZgrAEPAlwFoAkaPK91mmhaQ15TMThI0/Pwrnl5ktO1q8tRCvG+Jq9lLulKWBZjnaA9xxXJtPuaPv3x0q3txaaHeRFZLaB/j4Kpyz6lCvpRMoynYRkh9Wa7b6L/YPfhsk1SEj169mxe+gwp16RbkG+6X+UxWGrLEKQB8d6GuM1kearlOKVe0akrTmu9f6pQw2Ph+3khispTLa2EX/XsTCWHU8kOmMb3EHWuw5DS6QhDs/6rFuroLe6Y8ohnxjdeD4zlX/5PoVU0jeuLDnoa9kZWnsxJRVkqhf6HAOV2ZI5Iln22fTZyC249eiSlUd+Qt4n7GJu94F4nAWeWdnu/IzSeN3t7FQ0+gK+f0aVCsOllJIjI0V0QYwtQD4Y1b6SQOfc5L3OVlBpJ/lcabrHNW4kDQ8OSjSaeMV4yZfqFCqptVWm2d3uV7S67wDBw3I5zIvls8pnqeK7XFmvqu76vVe3l9qPyTEHU5l/uZeWc3xeRDeeF5Mvx4WK2v/npfFEvn9t98sLXjPi+H1GeklYkNp/1LU1+/qANNMxSxZ+rh3HCdvkgbRObjZnXQyPDsIiRRIxRA11YWqDNzchi0mDc/F1+5g5DAPkEW3qCv4fr6y/RJ89iskD8f+j18heTj28OtDaz5LfW+NmyXn7DFJs3j3PJ5Ok5E4nkqjiTzJRtE0k0dS1F84cRRnkRz3u6K837Sgj+EbT0TxDTDp5fGPYn/L9/x4dK/FpziNblf6xCKSHi4JfUpoPfjjT0r88TbmBQAA
----- END AGENT TOKEN -----
LiveData Migrator Data Agent installed successfully.
You don't need to start the livedata-migrator-data-agent service.
Copy the authentication token from the output in your terminal.
Alternatively, you can find the generated token in /opt/wandisco/livedata-migrator-data-agent/reg_data_agent.json.
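If you save the installer output to a file, you can pull the token out from between the BEGIN and END marker lines instead of copying it by hand. The following is a minimal sketch and not part of the product: the function name extract_agent_token is hypothetical, and it assumes the markers appear on their own lines as in the example output above.

```shell
# extract_agent_token (hypothetical helper): print only the lines that sit
# between the "BEGIN AGENT TOKEN" and "END AGENT TOKEN" marker lines.
# $1: a file holding saved installer output (assumption: you captured it yourself).
extract_agent_token() {
  awk '
    /END AGENT TOKEN/   { inside = 0 }   # stop before the closing marker
    inside              { print }        # emit token lines only
    /BEGIN AGENT TOKEN/ { inside = 1 }   # start after the opening marker
  ' "$1"
}
```

For example, you might capture the output with `./livedata-migrator-data-agent-installer.sh | tee install.log` (install.log is an illustrative file name) and then run `extract_agent_token install.log`.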
Configure the agent with the UI
Add an agent with the UI
To add an agent with the UI, follow these steps:
- Select your Data Migrator product from the Instances list in the dashboard.
- Under Filesystems and Agents, select Data Transfer Agents.
- Select Add a data transfer agent.
- Follow the steps in the UI to install the agent in your terminal. See Install an agent.
- Paste the authentication token from the output in your terminal into the Authentication Token field.
Data Migrator derives the hostname and the port number from this token and populates the corresponding fields in the UI automatically.
- Select Save.
Check the status of an agent with the UI
To check if the status of the connection to an agent is healthy:
- From the Dashboard, select an instance under Instances.
- Under Filesystems and Agents, select Data Transfer Agents. The Data Transfer Agents page lists whether the agents are healthy or unhealthy under the Status column.
Check the progress of an agent migrating data with the UI
To check the progress of an agent migrating data:
- From the Dashboard, select an instance under Instances.
- Under Filesystems and Agents, select Data Transfer Agents.
- Select an agent from the Data Transfer Agents page.
- On the Data Transfer Agent Progress panel, view the following:
- Data transferred in the past 24 hours (up to the last hour)
- Data transferred in the past week
Metrics appear only after the agent begins to transfer data. You can also view bandwidth usage for your configured agents. An overview of migrated data is available on the Metrics tab for the product.
Set up email notifications
To get notified by email about the status of agents, go to the Email Notifications page to set up these alerts.
For more information, see Configure email notifications.
Remove an agent with the UI
To remove an agent, follow these steps:
- From the Dashboard, select an instance under Instances.
- Under Filesystems and Agents, select Data Transfer Agents.
- Select the agent from the list that you want to remove.
- In the panel Data Transfer Agent Connection, select Remove.
The agent no longer appears in the list.
Configure the agent with the CLI
Add an agent with the CLI
agent add --agent-name <example_agent_name> --agent-token-file /opt/wandisco/livedata-migrator-data-agent/connection_token
Check the status of an agent with the CLI
agent show --agent-name <example_agent_name>
Check the progress of an agent migrating data with the CLI
To check the progress of data transferring with an agent, use the following command:
agent stats summary --agent-name <example_agent_name>
{
"lastWeekTotalBytes": 20971520,
"lastDayTotalBytes": 20971520,
"agentName": "<example_agent_name>"
}
Metrics appear only after the agent begins to transfer data.
If an agent hasn't been used yet to transfer data, you see the following output:
"error" : 15004,
"status" : 404,
"title" : "Data Agent not found",
"message" : "Data agent with given name not found"
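The byte totals in the summary output above can be easier to read in mebibytes. The following is a minimal sketch, assuming you have saved or piped the JSON summary as text; the helper names summary_field and bytes_to_mib are hypothetical, and the field names are taken from the example output.

```shell
# summary_field (hypothetical helper): read the summary JSON on stdin and
# print the numeric value of the field named in $1.
summary_field() {
  sed -n "s/.*\"$1\"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p"
}

# bytes_to_mib (hypothetical helper): convert a byte count in $1 to
# mebibytes, printed with two decimal places.
bytes_to_mib() {
  LC_ALL=C awk -v bytes="$1" 'BEGIN { printf "%.2f\n", bytes / (1024 * 1024) }'
}
```

With the example output, `bytes_to_mib 20971520` prints 20.00, because 20971520 bytes is exactly 20 MiB.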
Remove an agent with the CLI
agent delete --agent-name <example_agent_name>
For more information, see the section on data transfer agents in the Command reference.
Configure the agent with the REST API
Add an agent with the REST API
If you install an agent on a different node than Data Migrator, you need to expose Data Migrator's REST API. If you don't do this before adding the agent with the REST API, the connection between the agent and Data Migrator won't work.
To expose the REST API externally, navigate to /etc/wandisco/livedata-migrator/application.properties and delete or comment out the server.address=127.0.0.1 line.
Restart Data Migrator to apply the change.
For more information, see System service commands - Data Migrator.
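Commenting out the server.address line can also be scripted. The following is a minimal sketch under stated assumptions: the function name comment_out_server_address is hypothetical, it relies on sed's in-place option with a backup suffix, and you may need root privileges to edit the real file. Try it on a copy of the file first.

```shell
# comment_out_server_address (hypothetical helper): prefix any active
# server.address line with '#'. A backup copy with a .bak suffix is kept
# next to the file. Lines that are already commented out are left alone.
# $1: path to the properties file to edit.
comment_out_server_address() {
  sed -i.bak 's/^[[:space:]]*server\.address=/#&/' "$1"
}
```

For example: `comment_out_server_address /etc/wandisco/livedata-migrator/application.properties`, followed by a restart of Data Migrator as described above.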
Add the downloaded agent by running the curl command on the same node on which it's installed.
curl -X POST -H "Content-Type: application/json" -d @/opt/wandisco/livedata-migrator-data-agent/reg_data_agent.json http://<ldm-hostname>:18080/scaling/dataagents/
Example output:
{
"name" : "agent-example-vm.bdauto.wandisco.com",
"id" : "example-vm.bdauto.wandisco.com:1433",
"host" : "example-vm.bdauto.wandisco.com",
"port" : 1433,
"type" : "GRPC",
"version" : "2.1.0",
"healthy" : true,
"health" : {
"lastStatusUpdateTime" : 1670348953607,
"lastHealthMessage" : "Agent example-vm.bdauto.wandisco.com:1433 - health check became OK",
"status" : "CONNECTED"
}
}
Check the agent was added by running the curl command on any node of the cluster.
curl -X GET http://<ldm-hostname>:18080/scaling/dataagents/<example_agent_name>
Example output:
{
"name" : "agent1",
"host" : "example-vm.bdauto.wandisco.com",
"port" : 1433,
"type" : "GRPC",
"version" : "2.1.0",
"healthy" : true,
"health" : {
"lastStatusUpdateTime" : 1673539013554,
"lastHealthMessage" : "Agent agent1 - health check became OK",
"status" : "CONNECTED"
}
}
<ldm-hostname> is the host where Data Migrator is installed.
If basic authentication is enabled, use the following commands:
curl -u example_user:example_password -X POST -H "Content-Type: application/json" -d @/opt/wandisco/livedata-migrator-data-agent/reg_data_agent.json http://<ldm-hostname>:18080/scaling/dataagents/
curl -u example_user:example_password -X GET http://<ldm-hostname>:18080/scaling/dataagents/<example_agent_name>
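When scripting the check, you can test the "healthy" field of the GET response rather than reading the JSON by eye. The following is a minimal sketch, assuming the response has the shape shown in the example output above; the function name agent_is_healthy is hypothetical, and it uses a plain text match rather than a JSON parser.

```shell
# agent_is_healthy (hypothetical helper): read a data agent JSON response on
# stdin and succeed (exit status 0) only if it contains "healthy" : true.
agent_is_healthy() {
  grep -Eq '"healthy"[[:space:]]*:[[:space:]]*true'
}
```

For example, pipe the output of the GET request shown above into the function: `curl -s "http://<ldm-hostname>:18080/scaling/dataagents/<example_agent_name>" | agent_is_healthy && echo "agent healthy"`.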
REST API endpoints
The REST API for data transfer agents is documented as a swagger endpoint:
http(s)://<ldm-hostname>:18080/swagger-ui/index.html?configUrl=/v3/api-docs/swagger-config#/Data%20Agents%20Controller
To enable swagger documentation, follow the steps in the API reference section of the product user guide.
To make manual API calls, you can also use the web interface of the swagger-based REST API documentation, in the same way as for the data migrations API.
Uninstall an agent
To uninstall an agent, use one of the following commands, depending on your package manager (yum on RPM-based systems, apt-get on Debian-based systems):
yum remove -y livedata-migrator-data-agent
Or
apt-get purge -y livedata-migrator-data-agent
(Optional) Delete all related agent directories:
rm -rf /etc/wandisco/livedata-migrator-data-agent /var/log/wandisco/livedata-migrator-data-agent /var/run/livedata-migrator-data-agent /opt/wandisco/livedata-migrator-data-agent
Upgrade an agent
To upgrade to the latest version of the data transfer agent, download and run a new installer (see Install an agent). After upgrading, the agent version is updated automatically in the UI.
If you've got only one agent, there's a short period of time when the agent version and the Data Migrator version don't match. This may temporarily affect your filesystems or migrations until the versions are matched. To prevent any disruption to migrations, stop all data migrations before you upgrade. Resume stopped data migrations manually after the versions match.
For more information on stopping and resuming data migrations manually, see Bulk actions.
If you've got multiple agents, don't upgrade them all at once. When upgraded agents match the Data Migrator version and resume transferring data, you can continue to upgrade the other agents. This way, upgrading to a newer version of your agents won't affect your filesystems or migrations. Any migrations that are in progress continue transferring data as normal.
After upgrading, old agent values in the vars.env file are replaced by new ones, and new properties are added to the end of the file.
The minimum supported version of Data Migrator with data transfer agents is version 1.22.0.
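A version check against this minimum can be scripted before you install or upgrade agents. The following is a minimal sketch, assuming a sort command that supports version ordering with -V (as GNU coreutils does); the function name version_at_least is hypothetical.

```shell
# version_at_least (hypothetical helper): succeed if version $1 is greater
# than or equal to the minimum version $2.
# Assumption: sort supports -V (version sort), as in GNU coreutils.
version_at_least() {
  # The smaller of the two versions sorts first; the check passes when
  # that smaller version is the required minimum itself.
  [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}
```

For example, `version_at_least 1.26.2 1.22.0 && echo "agents supported"`. The version string to check can be taken from wherever you record your Data Migrator version; for an agent, it appears in the "version" field of the REST API response shown earlier.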
Learn more
Configure data transfer agent properties
Secure communication
Troubleshooting