
Create a metadata migration

Metadata migrations transfer existing metadata, as well as any subsequent changes made to the source metadata (within the scope of the migration), while Hive Migrator remains active.

caution

If you're using MariaDB or MySQL, add the JDBC driver to the classpath manually.
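
For example, on a Linux host you might copy the connector jar into the Hive Migrator agent's library directory and restart the agent. The jar version and installation path below are hypothetical; adjust them to match your environment:

Example: Adding the MySQL JDBC driver (hypothetical path and version)
cp mysql-connector-java-8.0.33.jar /opt/hivemigrator/agent/lib/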

info

Ensure migrations exist for the data (databases and tables) you want to migrate. This is required for transactional tables, which won't be populated on the target unless a migration for the underlying data exists first.

You need both the data and associated metadata before you can successfully run queries on migrated databases.

Create a metadata migration with the UI

tip

Before creating your metadata migration, create a metadata rule to define its scope.

  1. From your Dashboard, select the instance under Instances.
  2. Under Migrations, select Metadata Migrations.
  3. Select Create metadata migration.
  4. Under Migration Name, enter a name for this migration.
  5. Under Source, select a source Metadata Agent.
  6. Under Target, select a target Metadata Agent.
If a Databricks Unity Catalog Metastore Agent is selected as the target

Use the options under Target Agent Configuration Overrides to override your Databricks target agent configuration for this migration.

  • Catalog: Enter the name of your Databricks Unity Catalog.

  • External Location: Specify the external location by appending or adjusting the pre-populated URI.

    info

    Ensure the external location you specify has already been created in Databricks. Learn more from Azure, AWS and GCP.
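
    For reference, an external location URI takes the same form as the converted-data example later on this page; the container, account, and directory names below are placeholders:

    Example: External location (placeholder values)
    abfss://file_system@account_name.dfs.core.windows.net/dir/external_location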

  • Conversion: Select Convert to Delta format (Optional) to convert tables to Delta Lake format and configure additional options.

    1. Select Delete after conversion (Optional) to delete raw data after it has been converted to Delta format and migrated to Databricks.

      info

      Only use this option if you're performing one-time migrations for the underlying table data. The Databricks agent doesn't support continuous (live) updates of table data if the data is deleted after conversion.

    2. Select Table Type to specify how converted tables are migrated. Choose Managed to convert Hive source tables to managed Delta tables, or External to convert them to external Delta tables.

      1. If you select External, enter the full URI of the external location to store the tables converted to Delta Lake in the Converted data location field.

        Example: Converted data location
        abfss://file_system@account_name.dfs.core.windows.net/dir/converted_to_delta
      info

      Source Delta tables are migrated as external tables regardless of the Table Type selection.

If a Databricks Workspace Hive Metastore - Legacy Metastore Agent is selected as the target

Use the options under Target Agent Configuration Overrides to override your Databricks target agent configuration for this migration.

  • Convert to Delta format: Select to convert your tables to Delta Lake format after migrating to Databricks.

  • Delete after conversion: Select to delete the underlying table data and metadata from the Filesystem Mount Point location after conversion.

    info

    Only use this option if you're performing one-time migrations for the underlying table data. The Databricks agent doesn't support continuous (live) updates of table data if you're converting to Delta Lake in Databricks.

  • Filesystem Mount Point: Enter the mount point path of your cloud storage on your DBFS (Databricks File System). The filesystem must already be mounted on DBFS. This mount point value is required for the migration process.

    Example: Mounted container's path
    /mnt/adls2/storage_account/
    info

    Learn more about mounting storage on Databricks for ADLS, S3, and GCP filesystems.
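
    As a minimal sketch, assuming an ADLS Gen2 container and a service principal that can access it, the mount can be created from a Databricks notebook (where dbutils is available). All names, IDs, and the secret scope below are placeholders:

    Example: Mounting an ADLS Gen2 container in a Databricks notebook (placeholder values)
    # OAuth configuration for a service principal; every value here is a placeholder
    configs = {
        "fs.azure.account.auth.type": "OAuth",
        "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        "fs.azure.account.oauth2.client.id": "<application-id>",
        "fs.azure.account.oauth2.client.secret": dbutils.secrets.get(scope="<scope-name>", key="<key-name>"),
        "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
    }

    # Mount the container so it appears under /mnt on DBFS
    dbutils.fs.mount(
        source="abfss://file_system@account_name.dfs.core.windows.net/",
        mount_point="/mnt/adls2/storage_account",
        extra_configs=configs,
    )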

  • In the Default Filesystem Override field, enter the DBFS table location value in the format dbfs:<location>. If Convert to Delta format is selected, enter the location on DBFS to store tables converted to Delta Lake. To store Delta Lake tables on cloud storage, enter the path to the mount point and the path on the cloud storage.

    Example: Using conversion
    dbfs:<converted_tables_path>
    Example: Using conversion and cloud storage
    dbfs:<value of Filesystem Mount Point>/<converted_tables_path>
    Example: Not using conversion
    dbfs:<value of Filesystem Mount Point>
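
    For instance, combining the mount point example above with a hypothetical converted-tables directory gives a concrete override value:

    Example: Using conversion and cloud storage (concrete placeholder values)
    dbfs:/mnt/adls2/storage_account/converted_to_delta
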
  7. Under Metadata rules, select a metadata rule to define the scope of the migration.
  8. Under Grouping, if you don't use Management Groups, leave the migration unassigned. If you do: as an admin user, select a management group or leave the migration unassigned; as a non-admin user, select your management group.
  9. Select Start migration automatically to start the migration as soon as it's created, or leave it unselected to start the migration manually later.
  10. Select Create to create the metadata migration.
info

Metrics on the metadata migration content summary page don't show correct results following certain metadata migration failures. See the Known issue for more information.

Create a metadata migration with the CLI

Migrate metadata from your source metastore to a target metastore using the hive migration add CLI command.

Define the source and target using the hive agent names detailed in the Connect to metastores section, and apply the hive rule names to the migration.

Follow the command links to learn how to set the parameters and see examples.

  1. Create a new metadata migration:

    hive migration add

    Apply the --auto-start parameter if you would like the migration to start right away.
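
    For example, assuming agents named sourceAgent and targetAgent and a metadata rule named hiveRule (all placeholders; see the linked command reference for the full parameter list):

    Example: hive migration add
    hive migration add --name hiveMigration --source sourceAgent --target targetAgent --rule-names hiveRule --auto-start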

  2. If you don't have auto-start enabled, manually start the migration:

    hive migration start
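
    For example, to start the migration created above (the migration name is a placeholder):

    Example: hive migration start
    hive migration start --name hiveMigration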