Create a metadata migration
Metadata migrations transfer existing metadata, as well as any subsequent changes made to the source metadata (in the scope of the migration), while Hive Migrator keeps working.
If you're using MariaDB or MySQL, add the JDBC driver to the classpath manually.
Ensure migrations exist for the data (databases and tables) you want to migrate. This is required for transactional tables, which aren't populated on the target unless the migration for the data exists first.
You need both the data and associated metadata before you can successfully run queries on migrated databases.
Create a metadata migration with the UI
Before creating your metadata migration, create a metadata rule to define its scope.
- From your Dashboard, select the instance under Instances.
- Under Migrations, select Metadata Migrations.
- Select Create metadata migration.
- Under Migration Name, enter a name for this migration.
- Under Source, select a source Metadata Agent.
- Under Target, select a target Metadata Agent.
If a Databricks Unity Catalog Metastore Agent is selected as target
Use the options under Target Agent Configuration Overrides to override your Databricks target agent configuration for this migration.
Catalog: Enter the name of your Databricks Unity Catalog.
External Location: Specify the external location by appending or adjusting the pre-populated URI.
Conversion: Select Convert to Delta format (Optional) to convert tables to Delta Lake format and configure additional options.
Select Delete after conversion (Optional) to delete raw data after it has been converted to Delta format and migrated to Databricks.
Info: Only use this option if you're performing one-time migrations for the underlying table data. The Databricks agent doesn't support continuous (live) updates of table data if the data is deleted after conversion.
Select Table Type to specify how converted tables are migrated. Choose Managed to convert Hive source tables to managed Delta, or External to convert Hive source tables to external Delta. If you select External, enter the full URI of the external location to store the tables converted to Delta Lake in the Converted data location field.
Example: Converted data location
abfss://file_system@account_name.dfs.core.windows.net/dir/converted_to_delta
Info: Source Delta tables are migrated as external tables regardless of the Table Type selection.
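Before entering a value in the Converted data location field, it can help to check that the URI has the same shape as the example above. The following is a minimal sketch and is not part of Hive Migrator: the helper name `split_abfss_uri` is hypothetical, and it only checks the general `abfss://<file_system>@<account_host>/<path>` shape without contacting any storage service.

```python
from urllib.parse import urlparse


def split_abfss_uri(uri: str) -> dict:
    """Split an abfss URI into file system, account host, and path.

    Hypothetical helper for illustration only. It checks the shape of
    the URI and does not verify that the location exists.
    """
    parts = urlparse(uri)
    if parts.scheme != "abfss":
        raise ValueError(f"expected an abfss:// URI, got scheme {parts.scheme!r}")
    if not parts.username or not parts.hostname:
        raise ValueError("expected abfss://<file_system>@<account_host>/<path>")
    return {
        "file_system": parts.username,   # text before the '@'
        "account_host": parts.hostname,  # storage account endpoint
        "path": parts.path,              # directory for the converted tables
    }


if __name__ == "__main__":
    # Placeholder URI taken from the example in this section.
    example = "abfss://file_system@account_name.dfs.core.windows.net/dir/converted_to_delta"
    print(split_abfss_uri(example))
```

A value that fails this check (for example, one that starts with `dbfs:`) is probably not a full external-location URI of the kind this field expects.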
If a Databricks Workspace Hive Metastore - Legacy Metastore Agent is selected as target
Use the options under Target Agent Configuration Overrides to override your Databricks target agent configuration for this migration.
Convert to delta format: Select to convert your tables to Delta Lake format after migrating to Databricks.
Delete after conversion: Select to delete the underlying table data and metadata from the Filesystem Mount Point location after conversion.
Info: Only use this option if you're performing one-time migrations for the underlying table data. The Databricks agent doesn't support continuous (live) updates of table data if you're converting to Delta Lake in Databricks.
Filesystem Mount Point: Enter the mount point path of your cloud storage on your DBFS (Databricks File System). The filesystem must already be mounted on DBFS. This mount point value is required for the migration process.
Example: Mounted container's path
/mnt/adls2/storage_account/
In the Default Filesystem Override field, enter the DBFS table location value in the format dbfs:<location>. If Convert to Delta format is selected, enter the location on DBFS to store tables converted to Delta Lake. To store Delta Lake tables on cloud storage, enter the path to the mount point and the path on the cloud storage.
Example: Using conversion
dbfs:<converted_tables_path>
Example: Using conversion and cloud storage
dbfs:<value of Filesystem Mount Point>/<converted_tables_path>
Example: Not using conversion
dbfs:<value of Filesystem Mount Point>
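The three Default Filesystem Override formats for the legacy Databricks target can be summarized in a short sketch. This is illustrative only and not part of Hive Migrator: the function name `default_fs_override`, its parameters, and the `/converted_tables` path are hypothetical, and the mount point is the example value from this section.

```python
from typing import Optional


def default_fs_override(
    mount_point: str,
    converted_tables_path: Optional[str] = None,
    on_cloud_storage: bool = False,
) -> str:
    """Build a Default Filesystem Override value (hypothetical helper).

    - No conversion: dbfs:<mount point>
    - Conversion, tables on DBFS: dbfs:<converted tables path>
    - Conversion, tables on cloud storage:
      dbfs:<mount point>/<converted tables path>
    """
    if converted_tables_path is None:
        # Not converting to Delta Lake: point at the mount point itself.
        return "dbfs:" + mount_point
    if on_cloud_storage:
        # Join the two paths with exactly one slash between them.
        return "dbfs:" + mount_point.rstrip("/") + "/" + converted_tables_path.lstrip("/")
    # Converting, with the converted tables kept on DBFS.
    return "dbfs:" + converted_tables_path


if __name__ == "__main__":
    mount = "/mnt/adls2/storage_account/"  # example mount point from this section
    converted = "/converted_tables"        # hypothetical path
    print(default_fs_override(mount))
    print(default_fs_override(mount, converted))
    print(default_fs_override(mount, converted, on_cloud_storage=True))
```

Keeping the three cases in one function makes it easier to see that the mount point appears in the override only when the tables aren't converted or when the converted tables are stored on cloud storage.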
- Under Metadata rules, select a metadata rule to define the scope of the migration.
- Under Grouping, if you don't have any Management Groups, leave the migration unassigned. If you use Management Groups: as an admin user, select a management group or leave the migration unassigned; as a non-admin user, select your management group.
- Select Start migration automatically to start the migration automatically or leave unselected to start manually after creation.
- Select Create to create the metadata migration.
Metrics shown on the metadata migration content summary page don’t show correct results following certain metadata migration failures. See the Known issue for more information.
Create a metadata migration with the CLI
Migrate metadata from your source metastore to a target metastore using the hive migration add CLI command.
Define the source and target using the hive agent names detailed in the Connect to metastores section, and apply the hive rule names to the migration.
Follow the command links to learn how to set the parameters and see examples.
Create a new metadata migration with hive migration add. Apply the --auto-start parameter if you want the migration to start right away. If you don't use --auto-start, start the migration manually after you create it.
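The command text can be assembled as in the sketch below. Only the `hive migration add` command and the `--auto-start` parameter come from this section. The `--source`, `--target`, `--name`, and `--rule-names` flag names, and every agent, rule, and migration name shown, are assumptions for illustration, so check them against the command reference before use.

```python
import shlex
from typing import List, Optional, Sequence


def build_migration_add(
    source_agent: str,
    target_agent: str,
    rule_names: Sequence[str],
    name: Optional[str] = None,
    auto_start: bool = False,
) -> List[str]:
    """Assemble the words of a 'hive migration add' command.

    Flag names other than --auto-start are assumptions, not confirmed
    by this section; verify them in the command reference.
    """
    words = [
        "hive", "migration", "add",
        "--source", source_agent,             # assumed flag name
        "--target", target_agent,             # assumed flag name
        "--rule-names", ",".join(rule_names), # assumed flag name and list format
    ]
    if name:
        words += ["--name", name]             # assumed flag name
    if auto_start:
        words.append("--auto-start")          # parameter named in this section
    return words


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    words = build_migration_add(
        "sourceAgent", "targetAgent", ["test_dbs", "user_dbs"],
        name="hive_migration", auto_start=True,
    )
    # Print the command text to enter in the CLI.
    print(shlex.join(words))
```

Leaving `auto_start` at its default of `False` produces a command without `--auto-start`, which matches the case where the migration is started manually after creation.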