Configure Iceberg Hive Catalog Target
Data Migrator supports metadata migration to Iceberg Hive Catalogs. Follow these instructions to create and configure an Iceberg agent for your Iceberg Hive Catalog on watsonx.data, using either the UI or the CLI.
Add an Iceberg metastore agent with the UI
- From the Dashboard, select an instance under Instances.
- Under Filesystems & Agents, select Metastore Agents.
- Select Connect to Metastore.
- Select the filesystem.
- Select Iceberg as the Metastore Type.
- Enter a Display Name.
- Select/confirm Apache Hive as the Catalog Type.
- Enter the name of your Iceberg catalog under Catalog Name.
- Enter the local path to a hive-site.xml file containing additional Iceberg Hive configuration in the Configuration Path field. Ensure the user running Data Migrator can access this path.
- Enter the name used to connect to the Iceberg Hive metastore under Hive Metastore Username.
- Enter the URI of your Iceberg Hive metastore thrift endpoint under Metastore URI. Include the scheme, for example: thrift://<host>:<port>.
- Enter the location on the target storage where the Iceberg metadata, manifest and snapshot files will reside under Warehouse Directory. For example: /warehouse.
Note: The Warehouse Directory path supplied should not reside under a migrated directory with Target Match enabled, as Target Match will attempt to match the source and target and remove the metadata files.
- (Optional) - Enter a filesystem URI into Default Filesystem Override to override the default filesystem URI.
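Before entering the Metastore URI, you can check that it has the expected shape. The following is a minimal Python sketch and is not part of Data Migrator; the function name check_metastore_uri is hypothetical. It only checks that the scheme, host and port are present, not that the metastore is reachable.

```python
from urllib.parse import urlparse


def check_metastore_uri(uri):
    """Return a list of problems found in a metastore URI (empty if none).

    Hypothetical helper: checks only the URI shape (thrift://<host>:<port>).
    """
    problems = []
    parsed = urlparse(uri)
    if parsed.scheme != "thrift":
        problems.append(f"expected scheme 'thrift', got {parsed.scheme!r}")
    if not parsed.hostname:
        problems.append("missing host")
    try:
        port = parsed.port  # raises ValueError if the port is not numeric
    except ValueError:
        problems.append("port is not a valid number")
    else:
        if port is None:
            problems.append("missing port")
    return problems


# The example URI from the CLI command in this page passes the check.
print(check_metastore_uri("thrift://my.thrift.host:9083"))  # prints []
```

An empty list means the URI includes the scheme, a host and a numeric port.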
Add an Iceberg metastore agent with the CLI
To add an Iceberg agent with the CLI, use the hive agent add iceberg CLI command:
hive agent add iceberg hive --catalog-name catalog_cat1 --config-path /etc/hadoop/watsonx/ --username ibmlhadmin --metastore-uri thrift://my.thrift.host:9083 --file-system-id aws-target --warehouse-dir / --catalog-type HIVE --name SUPERAGENT
Update an existing Iceberg metastore agent with the UI
- From the Dashboard, select an instance under Instances.
- Under Filesystems & Agents, select Metastore Agents.
- Select your Iceberg agent from the list.
- Update your desired parameters on the Metastore Connection.
Any update to an agent requires you to provide the authentication details again.
Update an Iceberg metastore agent with the CLI
To update an Iceberg agent with the CLI, use the hive agent configure iceberg hive CLI command:
hive agent configure iceberg hive --name ice1 --username admin2
The health check status of an Iceberg agent may be reported incorrectly if the agent is updated repeatedly. See the related Known issue for more information.
Remember to define your target filesystem and add any accompanying data migrations for the tables and databases you need to migrate.
Additional Iceberg Hive configuration
Specify any additional configuration required to connect to your specific watsonx.data Iceberg Hive Catalog instance using a hive-site.xml file in the Hadoop XML configuration format.
Supply this configuration when adding your agent, using the Configuration Path field in the UI or the --config-path parameter in the CLI.
See the examples below for some common types of configuration that may be required, depending on your specific watsonx.data instance.
Ensure the user running Data Migrator can access the path and file specified when you supply additional configuration.
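One way to confirm that the file is readable and well-formed is to parse it as the user that runs Data Migrator. The following is a minimal Python sketch under that assumption; it is not part of Data Migrator, and the function name load_hive_site is hypothetical. It reads a file in the Hadoop XML configuration format and returns its properties.

```python
import os
import xml.etree.ElementTree as ET


def load_hive_site(path):
    """Parse a Hadoop-style XML configuration file into a dict of name -> value.

    Hypothetical helper: raises an error if the file is missing, unreadable
    by the current user, not well-formed XML, or lacks a <configuration> root.
    """
    if not os.path.isfile(path):
        raise FileNotFoundError(f"no such file: {path}")
    if not os.access(path, os.R_OK):
        raise PermissionError(f"current user cannot read: {path}")
    root = ET.parse(path).getroot()  # raises ParseError on malformed XML
    if root.tag != "configuration":
        raise ValueError(f"expected <configuration> root, got <{root.tag}>")
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props
```

Run the check as the same operating system user that runs Data Migrator, for example with load_hive_site("/path/to/hive-site.xml"), where the path is a placeholder for your own file.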
Example: Provide target metastore security credentials
The example below uses client configuration to specify the authentication mode, username and password required to connect to the target metastore. The security credential is stored in a JCEKS credential provider file.
<configuration>
  <property>
    <name>hive.metastore.client.auth.mode</name>
    <value>PLAIN</value>
  </property>
  <property>
    <name>hive.metastore.client.plain.username</name>
    <value>metastoreuser1</value>
  </property>
  <property>
    <name>hadoop.security.credential.provider.path</name>
    <value>localjceks://file/etc/cirata/hivemigrator/watsonx_truststore/wandisco-watsonx.jceks</value>
  </property>
  ...
  ...
</configuration>
Example: SSL configuration
If your watsonx.data Hive Catalog metastore presents a certificate, provide additional configuration so that your Iceberg agent trusts this certificate.
<configuration>
  <property>
    <name>hive.metastore.truststore.type</name>
    <value>JKS</value>
  </property>
  <property>
    <name>hive.metastore.truststore.path</name>
    <value>file:///etc/cirata/hivemigrator/watsonx_truststore/cacerts</value>
  </property>
  <property>
    <name>hive.metastore.truststore.password</name>
    <value>changeme</value>
  </property>
  ...
  ...
</configuration>
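The truststore path in this example is a file:// URI, so the file it refers to must exist locally and be readable by the user running Data Migrator. The following is a minimal Python sketch of such a check; it is not part of Data Migrator, and the function names are hypothetical. It assumes the value is either a file:// URI or a plain local path.

```python
import os
from urllib.parse import urlparse


def truststore_local_path(value):
    """Convert a file:// URI (or a plain path) to a local filesystem path."""
    parsed = urlparse(value)
    if parsed.scheme not in ("", "file"):
        raise ValueError(f"not a local file reference: {parsed.scheme!r}")
    return parsed.path


def truststore_is_readable(value):
    """Return True if the referenced truststore file exists and is readable."""
    path = truststore_local_path(value)
    return os.path.isfile(path) and os.access(path, os.R_OK)
```

For example, truststore_is_readable("file:///etc/cirata/hivemigrator/watsonx_truststore/cacerts") returns True only if that file exists and the current user can read it.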
Next steps
If you have already added Metadata Rules, create a Metadata Migration.
You can also add metadata rules with the hive rule add CLI command to define the scope, then create a metadata migration with hive migration add.