Version: 2.2

Install Data Migrator

Ready to install? Check the prerequisites and then follow these steps to get up and running with Data Migrator.

Install Data Migrator

  1. Download the Data Migrator installer to your chosen host. If you're migrating from Hadoop Distributed File System (HDFS), use an edge node in the Hadoop cluster. Run the following command to download the installer:

    wget https://cirata.com/downloads/livedata-migrator.sh
  2. Make the installation script executable and run it as the root user (or with sudo). These commands assume that the installer is in your working directory.

    chmod +x livedata-migrator.sh && ./livedata-migrator.sh

    Alternative /tmp directory

    The Data Migrator installer extracts its contents to a temporary directory and decompresses them. By default, the temporary directory is a sub-directory of /tmp.

    In some situations, extracting and decompressing in the default temporary directory fails, for example when there is not enough free disk space or when /tmp is mounted with the noexec option.

    To avoid these issues, extract the contents to a different temporary directory by adding the --target option when you run the installer:

    Example (Using the full installer that contains packages for all supported platforms)
    ./livedata-1.21.0-4-full_rpm_installer.sh --target /opt/wandisco/alternate_tmp_dir

    Do not use /opt/wandisco/tmp as the value for --target or the installation will fail.

    You can delete your temporary directory and its contents after installation.

  3. Check the status of each service:

    Run the following command for each of the services listed below:
    systemctl status <service name>

    The service names are: livedata-migrator, hivemigrator, and livedata-ui.

    If your system doesn't use systemd, the commands take a different form.
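
As a rough sketch of how the checks in steps 2 and 3 could be scripted: the first function tests whether a string of mount options contains noexec, and the second reports the systemd state of the three services named above. The function names has_noexec and check_services are hypothetical and are not part of Data Migrator; the sketch assumes a systemd host.

```shell
# Hypothetical helpers for illustration; not part of Data Migrator.

# Succeeds if the comma-separated mount options in $1 include "noexec".
has_noexec() {
  case ",$1," in
    *,noexec,*) return 0 ;;
    *) return 1 ;;
  esac
}

# Prints "<service>: <state>" for each service.
# Returns 1 if any service is not reported as active.
check_services() {
  local svc state rc=0
  for svc in livedata-migrator hivemigrator livedata-ui; do
    # "systemctl is-active" prints the state and exits non-zero unless active.
    state=$(systemctl is-active "$svc" 2>/dev/null) || rc=1
    echo "$svc: ${state:-unknown}"
  done
  return "$rc"
}
```

Before installing, something like has_noexec "$(findmnt -n -o OPTIONS --target /tmp)" indicates whether the --target option is likely to be needed. After installing, check_services gives a one-line summary per service.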

Adjust service file descriptor limit

For most production use cases we recommend increasing the Data Migrator service file descriptor limit. Contact Support if you have questions or concerns before changing this value.

To adjust the Data Migrator service file descriptor limit:

  1. On the command line, run the systemctl edit <service> command.

    Example: edit the livedata-migrator service
    systemctl edit livedata-migrator
  2. Add the following lines to adjust the file descriptor limit, then save the changes.

    Example: file descriptor configuration values
    [Service]
    LimitNOFILE=64000
  3. Restart the Data Migrator service.

    Example: restart Data Migrator
    systemctl restart livedata-migrator

These steps create a file that overrides the default file descriptor configuration of the Data Migrator (livedata-migrator) service. The override file is located at:

/etc/systemd/system/livedata-migrator.service.d/override.conf
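
Where an interactive editor isn't practical (for example, in a provisioning script), the same override file can be written directly. The following is a minimal sketch under that assumption; write_nofile_override is a hypothetical helper name, and the optional third argument exists only so the function can be tried against a scratch directory.

```shell
# Hypothetical helper; writes the same content that the steps above save
# through "systemctl edit".
# Usage: write_nofile_override <unit> <limit> [systemd_dir]
write_nofile_override() {
  local unit="$1" limit="$2" base="${3:-/etc/systemd/system}"
  local dir="$base/$unit.service.d"
  mkdir -p "$dir" || return 1
  printf '[Service]\nLimitNOFILE=%s\n' "$limit" > "$dir/override.conf"
}
```

Run as root, write_nofile_override livedata-migrator 64000 produces the file at the path shown above. Unlike systemctl edit, a hand-written file is not picked up automatically, so run systemctl daemon-reload before systemctl restart livedata-migrator. To confirm the value that systemd applied, run systemctl show livedata-migrator --property=LimitNOFILE.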

System and custom users for installation

To run the installer using a default user, run the following command:

./livedata-migrator.sh

Component       Default system user   Default system user group
Data Migrator   hdfs                  hdfs
WANdisco UI     hdfs                  hdfs
Hive Migrator   hive                  hadoop

A source with a mounted network-attached filesystem (NFS) is unlikely to have the default users present. In this case, set up your own user and group. See Running a service as a different user or group.
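
To find out in advance whether the default users and groups exist on a host, a check along these lines may help. It is a sketch only: the function names are hypothetical, and it assumes the id and getent utilities are available.

```shell
# Hypothetical helpers for illustration; not part of Data Migrator.
user_exists()  { id -u "$1" >/dev/null 2>&1; }
group_exists() { getent group "$1" >/dev/null 2>&1; }

# Reports each default user or group that is missing on this host.
# Returns 1 if anything is missing, 0 otherwise.
check_default_accounts() {
  local name rc=0
  for name in hdfs hive; do
    user_exists "$name" || { echo "missing user: $name"; rc=1; }
  done
  for name in hdfs hadoop; do
    group_exists "$name" || { echo "missing group: $name"; rc=1; }
  done
  return "$rc"
}
```

If check_default_accounts reports anything missing, installing with a custom user and group is the likely route.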

User and group privileges

If your cluster is Kerberized, the default hdfs and hive users and groups have more privileges than are necessary. Increase security by using system users and groups that have only the minimum required permissions.

Install with custom user

To install the product using a custom user or a custom user group, run the following commands:

Light installer (Obtains all packages from remote repository.)
./livedata-migrator.sh --user <custom user> --group <custom group>
Full installer (Contains all packages, no access to remote repository required.)
./livedata-x.x.x-xx-full_rpm_installer.sh -- --user <custom user> --group <custom group>

This sets the custom user and custom user group for all services and their respective directories.
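
The two commands differ only in the -- separator that the full installer needs before the options. A small wrapper can make that difference hard to forget. This is an illustrative sketch: build_install_cmd is a hypothetical name, and the DEB file name pattern is an assumption modeled on the RPM name shown above.

```shell
# Hypothetical helper: prints the installer command line for a custom
# user and group. Assumes full installers are named like
# *full_rpm_installer.sh or (by analogy, unverified) *full_deb_installer.sh.
build_install_cmd() {
  local installer="$1" user="$2" group="$3"
  case "$installer" in
    *full_rpm_installer.sh|*full_deb_installer.sh)
      # Full installer: options go after a "--" separator.
      echo "$installer -- --user $user --group $group" ;;
    *)
      # Light installer: options are passed directly.
      echo "$installer --user $user --group $group" ;;
  esac
}
```

The function only prints the command, so the result can be reviewed before it is run.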

For more information about configuring custom users, go to Configure system users.

The light installer obtains all packages from a remote repository. Its file size is small, but the host must be able to reach the remote repository to download the packages. The full installer, either RPM or DEB, is larger at about 1.4 GB but doesn't require any remote access to install.

Install Data Migrator components on separate hosts

To install the components on separate hosts, run the following commands:

Install the UI on one host

./livedata-migrator.sh --noexec --keep
cd ui_ldm_hvm
rpm -ivh livedata-ui-<version-number>.noarch.rpm

Install Data Migrator, Hive Migrator, and WANdisco CLI on one host

./livedata-migrator.sh --noexec --keep
cd ui_ldm_hvm
rpm -ivh livedata-migrator-<version-number>.noarch.rpm
rpm -ivh hivemigrator-<version-number>.noarch.rpm
rpm -ivh livedata-migrator-cli-<version-number>.noarch.rpm
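
After a split installation, it can be useful to confirm that each host has the packages intended for it. The sketch below assumes the installed package names match the RPM file name prefixes shown above; the role names ui and migrator and both function names are hypothetical.

```shell
# Hypothetical helpers for illustration; not part of Data Migrator.

# Prints the packages expected on a host with the given role.
packages_for_role() {
  case "$1" in
    ui)       echo "livedata-ui" ;;
    migrator) echo "livedata-migrator hivemigrator livedata-migrator-cli" ;;
    *)        return 1 ;;
  esac
}

# Prints "<package>: installed" or "<package>: missing" for the role.
# Returns 0 if all are installed, 1 if any are missing, 2 for an unknown role.
verify_role() {
  local pkgs pkg rc=0
  pkgs=$(packages_for_role "$1") || return 2
  for pkg in $pkgs; do
    if rpm -q "$pkg" >/dev/null 2>&1; then
      echo "$pkg: installed"
    else
      echo "$pkg: missing"
      rc=1
    fi
  done
  return "$rc"
}
```

For example, run verify_role ui on the UI host and verify_role migrator on the host that runs Data Migrator, Hive Migrator, and the CLI.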

Next steps

Once you have Data Migrator running, you're ready to get started.