Upgrade Data Migrator
We recommend you regularly upgrade Data Migrator so you can take advantage of new functionality and other improvements. To upgrade, run through the prerequisites covered below and then run a newer version of the Data Migrator installer. The installer upgrades your Data Migrator instance to the new version.
Before you upgrade
Read through the following section before you begin a product upgrade.
When you upgrade, you'll probably need to make some configuration changes.
Files in the /etc/wandisco directory contain custom configuration changes. The files generally used for configuration are:
/etc/wandisco/livedata-migrator/application.properties
/etc/wandisco/livedata-migrator/vars.env
/etc/wandisco/ui/application-prod.properties
/etc/wandisco/ui/vars.env
/etc/wandisco/hivemigrator/application.properties
/etc/wandisco/hivemigrator/vars.sh
/etc/wandisco/hivemigrator/user-vars.sh
Existing configuration
New versions can introduce additional configuration properties and improved default values. Compare your existing configuration and apply any new properties or applicable values supplied with the new configuration files included with your latest version. Check the release notes for changes to these files and make any changes before you restart services.
RPM
For RPM-based installations, your modified configuration is preserved when an RPM upgrade is applied.
The latest (new) configuration is saved to the same folder with a .rpmnew extension.
Compare your existing configuration and apply any new properties or applicable values supplied with the new configuration files included with your latest version.
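The comparison can be sketched in shell as below. This is only an illustration: the function name compare_rpmnew is hypothetical, and the sketch assumes each .rpmnew file sits next to the live file it relates to, as described above.

```shell
# Hypothetical helper: show how each .rpmnew file differs from the live
# configuration file next to it. The default root comes from this page.
compare_rpmnew() {
  root="${1:-/etc/wandisco}"
  find "$root" -type f -name '*.rpmnew' | sort | while IFS= read -r new; do
    live="${new%.rpmnew}"          # strip the suffix to get the live file
    echo "== $live"
    if [ -f "$live" ]; then
      # diff exits with 1 when the files differ, which is expected here.
      diff -u "$live" "$new" || true
    else
      echo "(no live file found next to $new)"
    fi
  done
}
```

Calling compare_rpmnew with no argument scans /etc/wandisco. Lines prefixed with + in the output are present only in the new configuration and are candidates to copy into your existing file.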
Debian-based
If you're on a Debian-based system, your current configuration is saved with the .dpkg-old extension and is no longer used. A new version of the configuration file, containing any new defaults and properties, is created and used.
Compare the new configuration with the .dpkg-old file and add your existing custom configuration to the new file before restarting services.
In most cases, we recommend keeping your current configuration and introducing any new properties as required. For /etc/wandisco/ui/application-prod.properties, it is essential to keep the existing configuration to ensure the UI starts.
See Debian automatic handling of configuration files for more information.
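As a rough aid for carrying custom settings across, the sketch below lists property keys that are set in a .dpkg-old file but missing from the new file. The function names are hypothetical, and the sketch assumes simple key=value lines with # comments, so treat its output as a starting point rather than a complete comparison.

```shell
# Print the keys (text before the first "=") from a properties file,
# skipping comment lines and blank lines.
property_keys() {
  grep -Ev '^[[:space:]]*(#|$)' "$1" | cut -d= -f1 | sed 's/[[:space:]]*$//' | sort -u
}

# Hypothetical helper: print keys set in the old file but absent from the new one.
keys_only_in_old() {
  old_keys=$(mktemp) || return 1
  new_keys=$(mktemp) || return 1
  property_keys "$1" > "$old_keys"
  property_keys "$2" > "$new_keys"
  comm -23 "$old_keys" "$new_keys"   # lines unique to the first (old) list
  rm -f "$old_keys" "$new_keys"
}
```

For example, keys_only_in_old /etc/wandisco/livedata-migrator/application.properties.dpkg-old /etc/wandisco/livedata-migrator/application.properties prints the keys you may still need to add to the new file. It does not detect keys whose values differ.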
If you use Hive Migrator, you must compact the H2 database before you upgrade. More information and steps to perform the compaction can be found here.
Hotfix patch
Newer releases can include previously issued hotfixes. If a hotfix you've deployed is included in the new version and is no longer required, fully remove the hotfix patch from your deployment.
If you've deployed a hotfix on your current version, see Hotfix patch removal important information to confirm if it's still required for your latest upgraded version.
Upgrading to Data Migrator 3.0 and later
See the following Known issue when upgrading remote agents with JDBC credentials: JDBC password overwritten on remote agent upgrade.
See the following Known issue article showing important changes in interpretation of the underscore character in different versions of Data Migrator metadata rule patterns.
Upgrading to Data Migrator 2.5 and later
Location mapping properties
If you're upgrading to 2.5.4 and use tables in the Hive metastore that have a path Serde property (created either by Spark or by custom Serdes) indicating the data location, and you need to transform this location to the location of your target platform data within migrations, review the location mapping properties information and contact Support so that these properties can be adjusted accordingly.
Databricks agents
If you're upgrading from any Data Migrator version prior to 2.5 and have Databricks agents, you must stop and delete all Databricks migrations and remove all Databricks agents before upgrading to Data Migrator 2.5 or later. This is required because of the significant improvements to this agent type in 2.5.
Upgrading with remote agents
If your current deployment uses remote agents, you must complete additional steps before proceeding with the upgrade. See the following knowledge base article - known issue.
Update Hive Migrator database
Data Migrator includes a script that performs a safe database schema update. This script runs automatically during installations or upgrades using RPM or Debian. No additional actions are required.
Manual database upgrade
Only perform a manual database upgrade if instructed to do so by Support.
If the automatic database update is interrupted or fails for any reason, contact Support for assistance. If instructed to do so, you can manually perform the database upgrade using the following script.
Hive Migrator database upgrade script:
/opt/wandisco/hivemigrator/bin/hivemigrator-db-upgrade.sh
Running the upgrade script performs the following:
- Creates a temporary directory /opt/wandisco/hivemigrator/hvm-db-upgrade-tmp. You can change its location.
- Copies the H2 database defined in /etc/wandisco/hivemigrator/application.properties to the temporary directory. The default entry in application.properties is:
  # H2 database location
  hivemigrator.storagePath=/opt/wandisco/hivemigrator/hivemigrator.db
- If an old H2 driver is present:
  - Detects agent databases placed in /opt/wandisco/hivemigrator/agent/.
  - Copies the agent database and runs H2 version transition for each agent database copy.
  - Overwrites the existing agent database with the copy if the version transition was successful.
  - Applies any missing schema updates up to version 1.14 to the main database.
  - Runs H2 version transition and deletes the old H2 driver.
- Applies the new schema to the database copy.
- Overwrites the existing database with the copy if the schema update was successful.
- Deletes the temporary directory.
Change the temporary database location
The script creates a temporary directory in the same folder as the existing database. To select a different temporary directory, use this command before running the script:
export CUSTOM_TMP_DIR="<Full-Path-To-Different-Directory>"
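A minimal sketch of setting the variable with a basic sanity check first is shown below. The function name use_custom_tmp_dir and the checks themselves are assumptions added for illustration; the upgrade script only requires that CUSTOM_TMP_DIR is exported.

```shell
# Hypothetical helper: check that a candidate directory is a full path and
# is writable, then export it as CUSTOM_TMP_DIR for the upgrade script.
use_custom_tmp_dir() {
  dir="$1"
  case "$dir" in
    /*) ;;                                            # full path, continue
    *) echo "not a full path: $dir" >&2; return 1 ;;
  esac
  if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
    echo "not a writable directory: $dir" >&2
    return 1
  fi
  export CUSTOM_TMP_DIR="$dir"
}

# Example use (the directory name is a placeholder; run the script only if
# Support has instructed you to):
#   use_custom_tmp_dir /path/to/different/directory &&
#     /opt/wandisco/hivemigrator/bin/hivemigrator-db-upgrade.sh
```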
Obtain a new installer and upgrade Data Migrator
To upgrade to the latest version of Data Migrator, download and run a new Data Migrator installer in the same way as for a first-time installation.
Upgrading to a newer version won't affect your filesystems or migrations. Any migrations that are in progress simply continue transferring data as normal.
You can check the component versions of your current installation by running the command livedata-migrator --version on your Data Migrator host machine.
System and custom users for upgrades
If you want to run the installer using a default user, run the following command:
./livedata-migrator.sh
The Data Migrator installer extracts its contents to a temporary directory and decompresses them. By default, the temporary directory is a sub-directory of /tmp.
In some situations, extracting and decompressing in the default temporary directory fails, for example, if there is not enough disk space remaining or if /tmp is mounted as noexec.
To avoid these issues, extract the contents to a different temporary directory by adding the --target option when you run the installer:
./livedata-1.21.0-4-full_rpm_installer.sh --target /opt/wandisco/alternate_tmp_dir
Do not use /opt/wandisco/tmp as the value for --target, or the installation will fail.
You can delete your temporary directory and its contents after installation.
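To decide in advance whether --target is needed, you could check the mount options of /tmp. The sketch below assumes the mount options are available as a comma-separated string, for example from the util-linux findmnt tool. The function names are hypothetical, and the sketch covers only the noexec case, not low disk space.

```shell
# Return success if a comma-separated mount option string contains "noexec".
has_noexec() {
  case ",$1," in
    *,noexec,*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: print the extra installer arguments to use.
# Prints nothing when /tmp is usable, or "--target <dir>" when it is noexec.
installer_target_args() {
  tmp_opts="$1"
  alt_dir="$2"
  if [ "$alt_dir" = "/opt/wandisco/tmp" ]; then
    echo "do not use /opt/wandisco/tmp as --target" >&2
    return 1
  fi
  if has_noexec "$tmp_opts"; then
    echo "--target $alt_dir"
  fi
}

# Example use (assumes findmnt is installed):
#   opts=$(findmnt -n -o OPTIONS --target /tmp)
#   installer_target_args "$opts" /opt/wandisco/alternate_tmp_dir
```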
The default system user for the Data Migrator and the UI services is hdfs, and the default system user for the Hive Migrator service is hive.
If you want to upgrade the product using a custom user and custom user group, run the following commands:
./livedata-migrator.sh --user <custom user> --group <custom group>
./livedata-migrator.sh -- --user <custom user> --group <custom group>
This sets the custom user and custom user group for all services and their respective directories.
For more information about configuring custom users, go to Configure system users.
If you don’t enter a custom user and group, then the pre-existing user and group are used from the following files:
/etc/wandisco/hivemigrator/vars.sh
/etc/wandisco/livedata-migrator/vars.env
/etc/wandisco/ui/vars.env
/etc/wandisco/hivemigrator/user-vars.sh
If any of these files don’t exist, the default user for that component is used instead.
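To see in advance which components would fall back to a default user, you could check which of these files exist. The sketch below is an illustration only: the function name report_vars_files is hypothetical, the optional root argument exists just to make the function easy to try against a test directory, and the default users are the ones stated earlier in this section.

```shell
# Hypothetical helper: for each vars file, report whether it exists or
# whether the component default user would be used instead.
report_vars_files() {
  root="${1:-/etc/wandisco}"
  for entry in \
    "hivemigrator/vars.sh:hive" \
    "livedata-migrator/vars.env:hdfs" \
    "ui/vars.env:hdfs" \
    "hivemigrator/user-vars.sh:hive"
  do
    file="$root/${entry%%:*}"        # path part before the colon
    default_user="${entry##*:}"      # default user after the colon
    if [ -f "$file" ]; then
      echo "$file: present, existing user and group are kept"
    else
      echo "$file: missing, default user '$default_user' is used"
    fi
  done
}
```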
Upgrade a Hive Migrator remote agent
Use the following steps to upgrade a Hive Migrator remote agent and reference this known issue:
1. Run the hive agent show command and copy the installationCommand value.
2. Upload the new hivemigrator-remote-server-installer.sh file to the remote host.
   Note: You can find the hivemigrator-remote-server-installer.sh file under /opt/wandisco/hivemigrator.
3. Make the installer executable:
   chmod +x hivemigrator-remote-server-installer.sh
4. Run the installation command copied in step 1.
   Example:
   ./hivemigrator-remote-server-installer.sh -- --silent --config 25ma-example-string-AbCdEfGhIjKADogCJpemxlbj==
5. Restart the hivemigrator-remote-server service:
   systemctl restart hivemigrator-remote-server
6. Check the remote agent is healthy using the hive agent check command.
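The commands that run on the remote host (making the installer executable, running the copied installation command, and restarting the service) can be sketched as one shell function. The function names and the DRY_RUN variable are hypothetical. By default the sketch only prints the commands so you can review them before running anything.

```shell
# Print the command when DRY_RUN is 1 (the default), otherwise run it.
run_step() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: first argument is the installer path; the remaining
# arguments are those from the installationCommand value you copied.
upgrade_remote_agent() {
  installer="$1"
  shift
  run_step chmod +x "$installer" || return 1
  run_step "$installer" "$@" || return 1
  run_step systemctl restart hivemigrator-remote-server
}

# Example use (replace <config-string> with your copied value; set
# DRY_RUN=0 to run the commands instead of printing them):
#   upgrade_remote_agent ./hivemigrator-remote-server-installer.sh -- --silent --config <config-string>
```

The hive agent show and hive agent check commands are not part of this sketch and are run as described in the steps above.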
Install components using RPM/DEB
If you're installing our product components individually using RPM/DEB, you can enter a custom user or group by adding a properties file with the custom user and group.
Example
/opt/wandisco/tmp/ldm.properties:
USERNAME="custom"
GROUPNAME="custom"
/opt/wandisco/tmp/ui.properties:
USERNAME="custom"
GROUPNAME="custom"
/opt/wandisco/tmp/hvm.properties:
HIVE_MIGRATOR_SERVER_USER="custom"
HIVE_MIGRATOR_SERVER_GROUP="custom"
When you install using RPM/DEB, the properties file containing the custom user and group names is used to set the user and group of the service and its respective directories.
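The three example files can be generated with a short shell function such as the sketch below. The function name write_user_properties is hypothetical. The file names and property keys are taken from the example above, and the directory you pass would typically be /opt/wandisco/tmp, as in that example.

```shell
# Hypothetical helper: write ldm.properties, ui.properties, and
# hvm.properties with the given user and group into a directory.
write_user_properties() {
  dir="$1"
  user="$2"
  group="$3"
  mkdir -p "$dir" || return 1
  for name in ldm ui; do
    printf 'USERNAME="%s"\nGROUPNAME="%s"\n' "$user" "$group" > "$dir/$name.properties"
  done
  printf 'HIVE_MIGRATOR_SERVER_USER="%s"\nHIVE_MIGRATOR_SERVER_GROUP="%s"\n' \
    "$user" "$group" > "$dir/hvm.properties"
}

# Example use:
#   write_user_properties /opt/wandisco/tmp custom custom
```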
If you upgrade a single component without using a properties file, the RPM/DEB checks for the pre-existing user and group in /opt/wandisco/hivemigrator/vars.sh, /opt/wandisco/livedata-migrator/vars.env, and /opt/wandisco/ui/vars.env. If any of these files don't exist, the installer uses the default user for that component.
This applies to the hivemigrator-remote-server installer.
If you don't pass a custom user or group to the installer when you upgrade, the existing vars.env or vars.sh for each component of the product is retained, and existing property values are inserted into the new vars.env or vars.sh provided by the component packaging.
We don't currently retain previous custom properties when you upgrade with a custom user or group.
Next steps
Continue migrating data as before. Learn how to get started.