Install Data Migrator
Ready to install? Check the prerequisites and then follow these steps to get up and running with Data Migrator.
Install Data Migrator
Download Data Migrator and install it on your chosen host. If you're migrating from Hadoop Distributed File System (HDFS), install Data Migrator on an edge node in the Hadoop cluster. Run the following command to download the installer:
wget https://cirata.com/downloads/livedata-migrator.sh
Make the installation script executable and install as the root (or sudo) user. These commands assume that the installer is inside your working directory.
chmod +x livedata-migrator.sh && ./livedata-migrator.sh
Alternative /tmp directory
The Data Migrator installer extracts its contents to a temporary directory and decompresses them. By default, the temporary directory is a sub-directory of /tmp.
In some situations, extracting and decompressing in the default temporary directory fails. For example, if there is not enough disk space remaining, or if /tmp is mounted as noexec.
To avoid these issues, extract the contents to a different temporary directory by adding the --target option when you run the installer:
Example (using the full installer that contains packages for all supported platforms):
./livedata-1.21.0-4-full_rpm_installer.sh --target /opt/wandisco/alternate_tmp_dir
Do not use /opt/wandisco/tmp as the value for --target or the installation will fail.
You can delete your temporary directory and its contents after installation.
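Before running the installer, it can help to check whether the default temporary directory is likely to cause the failures described above. The following is a minimal sketch, not part of the product: the function names are hypothetical, and it assumes `df` and the util-linux `findmnt` command are available on the host.

```shell
# Return 0 if a comma-separated mount option string contains "noexec".
has_noexec() {
  case ",$1," in
    *,noexec,*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the free space, in KiB, of the filesystem that holds directory $1.
free_kb() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Print the mount options of the filesystem that holds directory $1.
# Assumes findmnt (util-linux) is installed; prints nothing if it is not.
mount_opts() {
  findmnt -no OPTIONS --target "$1" 2>/dev/null
}

# Report on a candidate temporary directory (defaults to /tmp).
preflight_tmp() {
  dir="${1:-/tmp}"
  if has_noexec "$(mount_opts "$dir")"; then
    echo "$dir is mounted noexec; consider the --target option"
  fi
  echo "$dir has $(free_kb "$dir") KiB free"
}
```

Calling `preflight_tmp /tmp` prints the free space and warns if the directory is mounted noexec. The same check can be pointed at the directory you plan to pass to --target.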
Check the status of each service:
Run the following command for each of the services listed below:
systemctl status <service name>
The service names are: livedata-migrator, hivemigrator, and livedata-ui.
If your system is running without systemd, the commands use a different form.
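On a systemd host, the three status checks can be run in one loop. This is a minimal sketch: it uses `systemctl is-active --quiet` rather than `systemctl status` so that the result is a simple exit code, and `STATUS_CMD` is a hypothetical override hook added here for illustration.

```shell
# Service names as listed in this guide.
SERVICES="livedata-migrator hivemigrator livedata-ui"

# Check every service and report the ones that are not active.
# STATUS_CMD (hypothetical) replaces the default systemctl call,
# for example on a host without systemd.
check_services() {
  failed=""
  for svc in $SERVICES; do
    if ${STATUS_CMD:-systemctl is-active --quiet} "$svc"; then
      echo "$svc: active"
    else
      echo "$svc: not active"
      failed="$failed $svc"
    fi
  done
  # Succeed only when no service was reported as not active.
  [ -z "$failed" ]
}
```

Run `check_services` after installation. A non-zero exit status means at least one service needs a closer look with `systemctl status <service name>`.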
Adjust service file descriptor limit
For most production use cases we recommend increasing the Data Migrator service file descriptor limit. Contact Support if you have questions or concerns before changing this value.
To adjust the Data Migrator service file descriptor limit:
- On the command line, run the systemctl edit <service> command.
systemctl edit livedata-migrator
- Add the following lines to adjust the file descriptor limit and save the changes.
[Service]
LimitNOFILE=64000
- Restart the Data Migrator service.
systemctl restart livedata-migrator
These steps create a file to override the default Data Migrator (livedata-migrator) service file descriptors configuration located at:
/etc/systemd/system/livedata-migrator.service.d/override.conf
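If you need to apply the same change without the interactive editor (for example, from a provisioning script), the override file can be written directly. This is a sketch under that assumption, not a documented product procedure, and the function name is hypothetical.

```shell
# Write a systemd drop-in that sets the file descriptor limit.
# $1: drop-in directory, $2: limit value.
write_nofile_override() {
  mkdir -p "$1" || return 1
  printf '[Service]\nLimitNOFILE=%s\n' "$2" > "$1/override.conf"
}
```

As root, call `write_nofile_override /etc/systemd/system/livedata-migrator.service.d 64000`. Unlike `systemctl edit`, writing the file by hand does not reload systemd, so run `systemctl daemon-reload` before `systemctl restart livedata-migrator`. Afterwards, `systemctl show livedata-migrator --property=LimitNOFILE` reports the limit that systemd applies to the service.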
System and custom users for installation
To run the installer with the default system users, run the following command:
./livedata-migrator.sh
Each component then runs as the following default system user and group:
Component | Default system user | Default system user group |
---|---|---|
Data Migrator | hdfs | hdfs |
UI | hdfs | hdfs |
Hive Migrator | hive | hadoop |
A source with a mounted network-attached filesystem (NFS) is unlikely to have the default users present. In this case, set up your own user and group. See Running a service as a different user or group.
If your cluster is Kerberized, the default hdfs and hive users and groups have more privileges than are necessary. Increase security by using system users and groups that have only the minimum required permissions.
To install the product using a custom user or a custom user group, run the command that matches your installer:
./livedata-migrator.sh --user <custom user> --group <custom group>
./livedata-x.x.x-xx-full_rpm_installer.sh -- --user <custom user> --group <custom group>
This sets the custom user and custom user group for all services and their respective directories.
For more information about configuring custom users, go to Configure system users.
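Before passing --user and --group to the installer, you may want to confirm that both exist on the host. The following sketch uses the standard `id` and `getent` commands. The function names are hypothetical, and whether the installer creates missing users itself is not covered here, so treat this as an optional pre-check.

```shell
# Return 0 if the named user exists on this host.
user_exists() {
  id -u "$1" >/dev/null 2>&1
}

# Return 0 if the named group exists on this host.
group_exists() {
  getent group "$1" >/dev/null 2>&1
}

# Check a user and group pair before running the installer.
# $1: custom user, $2: custom group.
check_install_identity() {
  user_exists "$1" || { echo "user '$1' not found" >&2; return 1; }
  group_exists "$2" || { echo "group '$2' not found" >&2; return 1; }
  echo "ok: $1:$2"
}
```

For example, `check_install_identity <custom user> <custom group> && ./livedata-migrator.sh --user <custom user> --group <custom group>` runs the installer only when both names resolve.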
Install Data Migrator components on separate hosts
To install the components on separate hosts, run the following commands:
Install the UI on one host
./livedata-migrator.sh --noexec --keep
cd ui_ldm_hvm
rpm -ivh livedata-ui-<version-number>.noarch.rpm
Install Data Migrator, Hive Migrator, and CLI on one host
./livedata-migrator.sh --noexec --keep
cd ui_ldm_hvm
rpm -ivh livedata-migrator-<version-number>.noarch.rpm
rpm -ivh hivemigrator-<version-number>.noarch.rpm
rpm -ivh livedata-migrator-cli-<version-number>.noarch.rpm
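When scripting the per-host installs above, the exact `<version-number>` in each file name can be resolved from the extracted directory instead of typed by hand. This is a sketch only: the function name is hypothetical, and it assumes the version part of each file name starts with a digit (as in 1.21.0), which also keeps `livedata-migrator` from matching the `livedata-migrator-cli` package.

```shell
# Print the path of the first RPM in directory $1 whose name is
# "<prefix>-<version>.noarch.rpm", where the version starts with a digit.
# $1: directory with the extracted packages, $2: package name prefix.
find_rpm() {
  for f in "$1"/"$2"-[0-9]*.noarch.rpm; do
    # An unmatched glob stays literal, so confirm the file exists.
    if [ -e "$f" ]; then
      echo "$f"
      return 0
    fi
  done
  return 1
}
```

For example, after `cd ui_ldm_hvm`, running `rpm -ivh "$(find_rpm . livedata-ui)"` installs whichever UI package version the installer extracted.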
Next steps
Once you have Data Migrator running, you're ready to get started.