Version: 3.1.1 (latest)

Configure Iceberg Hive Catalog Target

Data Migrator supports metadata migration to Iceberg Hive Catalogs. Follow these instructions to create and configure an Iceberg agent that connects to your Iceberg Hive Catalog on watsonx.data, using either the UI or the CLI.

Add an Iceberg metastore agent with the UI

  1. From the Dashboard, select an instance under Instances.
  2. Under Filesystems & Agents, select Metastore Agents.
  3. Select Connect to Metastore.
  4. Select the filesystem.
  5. Select Iceberg as the Metastore Type.
  6. Enter a Display Name.
  7. Select or confirm Apache Hive as the Catalog Type.
  8. Enter the name of your Iceberg catalog under Catalog Name.
  9. Enter the local path to a hive-site.xml file containing additional Iceberg Hive configuration in the Configuration Path field. Ensure the user running Data Migrator can access this path.
  10. Enter the username used to connect to the Iceberg Hive metastore under Hive Metastore Username.
  11. Enter the URI of your Iceberg Hive metastore thrift endpoint under Metastore URI. Include the scheme, for example: thrift://<host>:<port>.
  12. Enter the location on the target storage where the Iceberg metadata, manifest and snapshot files will reside under Warehouse Directory. For example: /warehouse.
    note

    Don't place the Warehouse Directory under a migrated directory that has Target Match enabled. Target Match attempts to match the source and target, and would remove the Iceberg metadata files.

  13. (Optional) Enter a filesystem URI in Default Filesystem Override to override the default filesystem URI.
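Before entering the Metastore URI and Warehouse Directory, it can help to sanity-check the values on the Data Migrator host. The following is a minimal shell sketch, not part of Data Migrator: the helper names is_thrift_uri and is_under_dir are hypothetical, and /data/migrated stands in for a migrated directory with Target Match enabled.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers; they are not part of Data Migrator.

# Succeeds when $1 has the form thrift://<host>:<port> with a numeric port.
is_thrift_uri() {
  printf '%s\n' "$1" | grep -Eq '^thrift://[^:/[:space:]]+:[0-9]+$'
}

# Succeeds when path $1 equals or sits under directory $2.
# Plain string comparison: symlinks and ".." segments are not resolved.
is_under_dir() {
  case "${1%/}/" in
    "${2%/}/"*) return 0 ;;
    *) return 1 ;;
  esac
}

metastore_uri="thrift://my.thrift.host:9083"
warehouse_dir="/warehouse"
migrated_dir="/data/migrated"   # ASSUMPTION: a migrated path with Target Match enabled

if is_thrift_uri "$metastore_uri"; then
  echo "Metastore URI format looks valid"
else
  echo "Metastore URI must look like thrift://<host>:<port>"
fi

if is_under_dir "$warehouse_dir" "$migrated_dir"; then
  echo "Warning: Warehouse Directory is under a migrated directory"
else
  echo "Warehouse Directory is outside the migrated directory"
fi
```

The checks only validate the shape of the values. They don't confirm that the metastore is reachable or that Target Match is actually enabled on the directory.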

Add an Iceberg metastore agent with the CLI

To add an Iceberg agent with the CLI, use the hive agent add iceberg hive CLI command:

Iceberg agent add example
hive agent add iceberg hive --catalog-name catalog_cat1 --config-path /etc/hadoop/watsonx/ --username ibmlhadmin --metastore-uri thrift://my.thrift.host:9083 --file-system-id aws-target --warehouse-dir / --catalog-type HIVE --name SUPERAGENT

Update an existing Iceberg metastore agent with the UI

  1. From the Dashboard, select an instance under Instances.
  2. Under Filesystems & Agents, select Metastore Agents.
  3. Select your Iceberg agent from the list.
  4. Update your desired parameters on the Metastore Connection.

Updating an agent requires you to provide the authentication details again.

Update an Iceberg metastore agent with the CLI

To update an Iceberg agent with the CLI, use the hive agent configure iceberg hive CLI command:

Example update of an existing Iceberg agent
hive agent configure iceberg hive --name ice1 --username admin2
info

An Iceberg agent health check status may report incorrectly if updated repeatedly. See the following Known issue for more information.

tip

Remember to define your target filesystem and add any accompanying data migrations for the tables and databases you need to migrate.

Additional Iceberg Hive configuration

Specify any additional configuration required to connect to your watsonx.data Iceberg Hive Catalog instance in a hive-site.xml file that uses the Hadoop XML configuration format. Supply this file when you add your agent, using the Configuration Path field in the UI or the --config-path option in the CLI. The examples below show common types of configuration that may be required, depending on your watsonx.data instance.

tip

Ensure the user running Data Migrator can access the path and file specified when you supply additional configuration.
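One way to check access is a small shell helper run as the user that runs Data Migrator. This is a sketch under assumptions: check_hive_site is a hypothetical name, the example path assumes the file sits in the directory used in the CLI example, and the well-formedness check only runs when the xmllint utility happens to be installed.

```shell
#!/bin/sh
# Hypothetical helper; not part of Data Migrator.
# Succeeds when the hive-site.xml at $1 is a readable regular file and,
# if xmllint is installed, is also well-formed XML.
check_hive_site() {
  if [ ! -f "$1" ] || [ ! -r "$1" ]; then
    echo "Cannot read $1 as the current user" >&2
    return 1
  fi
  if command -v xmllint >/dev/null 2>&1; then
    xmllint --noout "$1" || return 1
  fi
  return 0
}

# ASSUMPTION: example location of the file; adjust to your Configuration Path.
if check_hive_site "/etc/hadoop/watsonx/hive-site.xml"; then
  echo "hive-site.xml looks usable"
fi
```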

Example: Provide target metastore security credentials

The example below uses client configuration to specify the authentication mode and username required to connect to the target metastore. The password is not written in the file; the example uses a JCEKS credential provider file to store the security credential.

Example hive-site.xml
<configuration>
  <property>
    <name>hive.metastore.client.auth.mode</name>
    <value>PLAIN</value>
  </property>
  <property>
    <name>hive.metastore.client.plain.username</name>
    <value>metastoreuser1</value>
  </property>
  <property>
    <name>hadoop.security.credential.provider.path</name>
    <value>localjceks://file/etc/cirata/hivemigrator/watsonx_truststore/wandisco-watsonx.jceks</value>
  </property>
  ...
  ...
</configuration>
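The credential provider file referenced above has to exist before the agent can use it. One way to create or inspect such a file is the hadoop credential command from a Hadoop client installation. The following is a sketch under assumptions rather than a documented Data Migrator procedure: it assumes a Hadoop client is installed on the host, and it assumes the password is stored under an alias equal to the metastore username (metastoreuser1 here). Confirm the alias your deployment expects. The function names are hypothetical.

```shell
#!/bin/sh
# Provider path taken from the example hive-site.xml above.
PROVIDER="localjceks://file/etc/cirata/hivemigrator/watsonx_truststore/wandisco-watsonx.jceks"
# ASSUMPTION: the alias under which the password is stored; verify for your deployment.
ALIAS="metastoreuser1"

# Hypothetical wrapper: stores a credential under $ALIAS in the provider file.
# hadoop prompts for the value, so the password stays out of the shell history.
create_metastore_credential() {
  hadoop credential create "$ALIAS" -provider "$PROVIDER"
}

# Hypothetical wrapper: lists the aliases currently held in the provider file.
list_metastore_credentials() {
  hadoop credential list -provider "$PROVIDER"
}

if command -v hadoop >/dev/null 2>&1; then
  list_metastore_credentials
else
  echo "hadoop CLI not found; run this on a host with a Hadoop client installed"
fi
```

Call create_metastore_credential once to store the password, then run the listing again to confirm that the alias appears.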

Example: SSL configuration

If your watsonx.data Hive Catalog metastore presents a certificate, add configuration so that your Iceberg agent trusts it. The example below points the agent at a JKS truststore.

Example hive-site.xml
<configuration>
  <property>
    <name>hive.metastore.truststore.type</name>
    <value>JKS</value>
  </property>
  <property>
    <name>hive.metastore.truststore.path</name>
    <value>file:///etc/cirata/hivemigrator/watsonx_truststore/cacerts</value>
  </property>
  <property>
    <name>hive.metastore.truststore.password</name>
    <value>changeme</value>
  </property>
  ...
  ...
</configuration>
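To add a certificate to the truststore, or to confirm that it is already present, the JDK keytool utility can be used. This is a minimal sketch, assuming keytool is on the PATH and that the metastore certificate has already been saved to a PEM file. The certificate path /tmp/metastore-ca.pem, the alias watsonx-metastore, and the function names are hypothetical; the truststore path and password come from the example above.

```shell
#!/bin/sh
# Local path from the example value above, without the file:// prefix.
TRUSTSTORE="/etc/cirata/hivemigrator/watsonx_truststore/cacerts"
STOREPASS="changeme"                # example password from above; replace it
CERT_FILE="/tmp/metastore-ca.pem"   # ASSUMPTION: where the certificate was saved
CERT_ALIAS="watsonx-metastore"      # ASSUMPTION: alias for the imported entry

# Hypothetical wrapper: imports the certificate into the JKS truststore.
import_metastore_cert() {
  keytool -importcert -noprompt -alias "$CERT_ALIAS" -file "$CERT_FILE" \
    -keystore "$TRUSTSTORE" -storetype JKS -storepass "$STOREPASS"
}

# Hypothetical wrapper: lists the truststore entries.
list_truststore() {
  keytool -list -keystore "$TRUSTSTORE" -storetype JKS -storepass "$STOREPASS"
}

if command -v keytool >/dev/null 2>&1 && [ -r "$TRUSTSTORE" ]; then
  list_truststore
else
  echo "keytool or the truststore is not available on this host"
fi
```

Run import_metastore_cert once, then list the truststore again to confirm the new alias is present.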

Next steps

If you have already added Metadata Rules, create a Metadata Migration. With the CLI, use the hive rule add command to add metadata rules that define the scope, then create a metadata migration with hive migration add.