Version: 3.3 (latest)

Verify metadata migrations

Metadata Verification

Discover discrepancies (inconsistencies) between source and target metadata migration paths by running a verification on Live, Complete, or Stopped metadata migrations.

Limitations and considerations

  • In Data Migrator 3.3+, only one verification can run at a time. Any additional verifications that are triggered enter a global queue
  • Verifications are supported only when both source and target are Hive agents of the same version (e.g., Hive 3 -> Hive 3), applying to both local and remote agents

Prerequisites

Migrations must have one of the following statuses to perform verification scans:

  • Live (real-time event stream notifications for changes to the source that are replicated to the target)
  • Complete (one-time migration without event stream notifications)
  • Stopped (a user has stopped the migration manually)

note

Running a verification while a migration is in progress is not recommended. Because the target is actively changing, the results are likely to be inaccurate or misleading.

Objects Verified

The following object types are verified as part of each verification:

  • Database
  • Table
  • Partition
  • Constraint
  • Materialized View

Verify migrations with the UI

View the migration status

  1. From the Dashboard, select the metadata migration you want to verify.

  2. On the Metadata Migration Verification panel, you can see:

    • An option to view the Verification Summary Report
    • Verification Status: Not Started, In Progress, Complete, or Queued
    • Total Inconsistencies: the number of discrepancies between the source and target paths

Verify a migration

Use the following options to create a new verification for a migration:

  1. Select Migration Verification from the sidebar menu.

  2. Click the Start Verification button.

Cancel a verification

You can cancel a verification that is in progress or queued. After you select Start verification, you can simply select Cancel check.

View a verification summary report

After you select Start verification, the Verification Summary Report panel is updated with information from the source and target found during the verification scan. You can view this while the verification is in progress or when it's complete or canceled. When the verification starts checking the source and target, you can compare:

  • The number of objects found on the source and the target
  • Total number of inconsistencies found, including object mismatches, missing or extra objects.

Verifications Report Directory

Completed verification reports for metadata can be found in /opt/wandisco/hivemigrator/verifications/.

caution

When a metadata migration is reset, the older verification reports in the verification report directory are not removed.

Download a full verification report

After the verification is complete, you can view and expand reports for completed verifications under Last x Verification Reports. Select Download all files to view, share, and analyze the results of the metadata migration verification. This will give you a full report as a tar archive file containing summary.json and the following files:

  • verification-discrepancy.json
  • verification-full-content.json
  • verification-summary.json
  • verification-missing-on-source.json

tip

Reports are downloaded as .gz files. Use a tool like gunzip or any compatible decompression utility to extract the file before viewing.

The verification-discrepancy report shows all discrepancies (inconsistencies) between the source and target metastores, while the full-content report shows the discrepancies together with all checks performed.

To download the full verification report, under Last x Verification Reports, expand the main verification report you need and select the download all files icon. Reports for verifications that failed or were canceled cannot be downloaded. You can identify them by the absence of a date and time in the 'Report Generated' column.
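As a rough illustration of unpacking a downloaded report, the following Python sketch extracts an archive with the standard `tarfile` module, which detects gzip compression automatically. The function name, archive name, and output directory are hypothetical. The actual file name depends on your download.

```python
import tarfile
from pathlib import Path


def extract_report(archive_path, out_dir):
    """Extract a (possibly gzip-compressed) tar archive.

    Returns the sorted names of the regular files that were extracted.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Mode "r:*" lets tarfile handle both plain and gzip-compressed archives.
    with tarfile.open(archive_path, "r:*") as archive:
        # Keep regular files only, and skip absolute or parent-relative paths.
        members = [
            m for m in archive.getmembers()
            if m.isfile()
            and not m.name.startswith("/")
            and ".." not in Path(m.name).parts
        ]
        archive.extractall(out, members=members)
    return sorted(m.name for m in members)
```

For example, `extract_report("report.tar.gz", "report")` (both names are placeholders) would unpack the JSON files into a `report` directory and return their names.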

Review the full-content report

Example full-content report.

full-content.json example
"summaryList" : [ {
  "verifyResult" : "OK",
  "metadataObjectName" : "database1"
}, {
  "verifyResult" : "MISSING_ON_SOURCE",
  "metadataObjectName" : "database1.table1"
}, {
  "verifyResult" : "MISSING_ON_TARGET",
  "metadataObjectName" : "database1.table2"
}, {
  "verifyResult" : "PROPERTY_MISMATCH",
  "metadataObjectName" : "database1.table3"
} ]

| scanResult | reason | Object | Explanation | Inconsistency |
|---|---|---|---|---|
| OK | OK | database1 | Object exists on source and target and is consistent | No Inconsistency |
| MISSING_ON_SOURCE | Object is missing from the source but is present on the target | database1.table1 | Object dropped from source after migration | Inconsistency |
| MISSING_ON_TARGET | Object is missing on the target but is present on the source | database1.table2 | Object added to source after migration | Inconsistency |
| Mismatch | Table owner was different | database1.table3 | Table owner changed after migration | Inconsistency |

scanResult and reason types

| scanResult | Description |
|---|---|
| OK | Object exists on both source and target metastore and is consistent |
| MISSING_ON_TARGET | Object exists on source but not on target |
| MISSING_ON_SOURCE | Object exists on target but not on source |
| MISMATCH | Object exists on both but inconsistent |
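To make the report easier to triage, the following Python sketch tallies entries by result and lists the objects that are not consistent. It assumes the full-content report is a JSON object with a top-level `summaryList` array whose entries carry `verifyResult` and `metadataObjectName`, as in the excerpt above. The rest of the file layout is not shown in this document, and the function names are hypothetical.

```python
import json
from collections import Counter


def count_results(report):
    """Count entries per verifyResult value in a parsed report."""
    return Counter(entry["verifyResult"] for entry in report.get("summaryList", []))


def inconsistent_objects(report):
    """Return the names of objects whose verifyResult is anything other than OK."""
    return [
        entry["metadataObjectName"]
        for entry in report.get("summaryList", [])
        if entry["verifyResult"] != "OK"
    ]


# Sample data copied from the example report excerpt; wrapping it in an
# enclosing JSON object is an assumption about the real file.
sample = json.loads("""
{ "summaryList" : [
  { "verifyResult" : "OK", "metadataObjectName" : "database1" },
  { "verifyResult" : "MISSING_ON_SOURCE", "metadataObjectName" : "database1.table1" },
  { "verifyResult" : "MISSING_ON_TARGET", "metadataObjectName" : "database1.table2" },
  { "verifyResult" : "PROPERTY_MISMATCH", "metadataObjectName" : "database1.table3" }
] }
""")

print(count_results(sample))
print(inconsistent_objects(sample))
```

To run this against a real report, replace `sample` with the result of `json.load()` on the extracted verification-full-content.json file.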

Verify migrations with the CLI

Use the following commands to manage verifications.

hive migration verification start

Start a new verification for a migration.

Example

Trigger a new verification for a migration
hive migration verification start --name migration1

hive migration verification list

List summaries for all or specified metadata verifications.

Example

List summaries for all verifications
hive migration verification list

hive migration verification show

Show the status of a specific migration verification.

Example

Status of a completed verification
hive migration verification show --verification-id migration1-1752065069284

hive migration verification stop

Stop a queued or in-progress migration verification.

Example

Stop a migration verification
hive migration verification stop --verification-id migration1-1752065069284

hive migration verification report

Download a full verification report.

Example

Download a verification report
hive migration verification report --verification-id migration1-1752065069284 --out-dir /user/exampleVerificationDirectory

Metadata Verification Repair [Preview]

This functionality allows you to repair discrepancies (inconsistencies) identified by metadata verifications.

Limitations and considerations

caution

Please note this feature is currently in preview and should NOT be used in production as it carries out destructive actions against target Hive metastores.

Limitations include but are not limited to:

  • Only the MISSING_ON_SOURCE inconsistency type is supported
  • Only the command-line interface (CLI) is supported. This release has no user interface (UI) for this feature
  • Error handling and notifications are limited
  • Only one repair can be run at a time
  • A repair must complete before a new one can be triggered
  • Repairs are supported only when both source and target are Hive agents of the same version (e.g., Hive 3 -> Hive 3), applying to both local and remote agents.

Prerequisites

A repair can only be carried out against a completed metadata verification. Migrations must have one of the following statuses to perform repair actions:

  • Live (real-time event stream notifications for changes to the source that are replicated to the target)
  • Complete (one-time migration without event stream notifications)
  • Stopped (a user has stopped the migration manually)

tip

Running a repair while a metadata migration is live and changes are being made to the source or target can lead to unexpected results, because the migration and the repair may both try to act on the same metadata object. We also recommend that no other processes write to the target during the repair operation. For example, compaction should be turned off. Verification report generation does not wait for the actions triggered by a repair to complete, so wait for the repair to finish before running another verification.

Enable Metadata Verification Repair

As the repair functionality is in preview, by default, it is not enabled. To enable:

  1. Add the configuration preview.feature.verification-repair=ON to /opt/wandisco/hivemigrator/application.properties
  2. Run systemctl restart hivemigrator to restart the service so that the change takes effect
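Before restarting the service, it can help to confirm that the flag is present. The following Python sketch checks properties-style text for the setting named above. The function name is hypothetical, the parsing is deliberately simplified (it ignores line continuations and `:` separators), and the case-sensitive match on `ON` is an assumption rather than documented behavior.

```python
FLAG = "preview.feature.verification-repair"


def repair_preview_enabled(properties_text):
    """Return True if the repair preview flag is set to ON in the given text."""
    value = None
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, val = line.partition("=")
        if sep and key.strip() == FLAG:
            value = val.strip()  # a later duplicate overrides an earlier one
    # Assumption: the value is matched exactly as written in the step above.
    return value == "ON"


# "example.other.setting" is a made-up placeholder property.
sample = "example.other.setting=1\npreview.feature.verification-repair=ON\n"
print(repair_preview_enabled(sample))
```

In practice you would pass in the contents of /opt/wandisco/hivemigrator/application.properties, for example via `open(path).read()`.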

Objects Repaired

The following object types are supported:

  • Table
  • Partition
  • Constraint
  • Materialized View

Inconsistencies Repaired

Currently, only the following inconsistencies can be repaired:

| scanResult | Repair Action |
|---|---|
| MISSING_ON_SOURCE | HVM will drop metadata objects that exist on the target but not on the source |
| MISSING_ON_TARGET | Currently not supported |
| MISMATCH | Currently not supported |

Usage

A metadata repair can be triggered against a completed metadata verification that has reported inconsistencies.

tip

Once triggered, the new repair becomes associated with the existing metadata verification. Each metadata verification can only be repaired once. If a verification has already been repaired, or a repair has been attempted, you must trigger another verification before triggering a new repair.

note

Since repairs carry out destructive actions, it is vital that you review the list of inconsistencies reported by a verification and approve of the objects to be dropped from the target metastore before triggering a repair. Once triggered, this action is NOT reversible.

  1. Trigger a metadata verification against a migration that is Live, Complete, or Stopped.
  2. Review the entire list of metadata objects that have been reported as inconsistent by the verification. The reports can be downloaded from the UI or found at: /opt/wandisco/hivemigrator/verifications/{migrationName}/{verificationName}/verification-missing-on-source.json
  3. Run a repair job against your metadata verification.

note

Once a repair is triggered, all of the objects identified and listed within this report will be dropped from your target metastore.
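The review step above can be supported with a small script. The following Python sketch builds the report path from the template given in step 2 and lists the objects that a repair would drop. It assumes, without confirmation from this document, that verification-missing-on-source.json uses the same `summaryList` layout as the full-content report. The function names and the sample migration and verification names are hypothetical.

```python
import json
from pathlib import Path

REPORT_ROOT = Path("/opt/wandisco/hivemigrator/verifications")


def missing_on_source_report(migration_name, verification_name, root=REPORT_ROOT):
    """Build the report path following the {migrationName}/{verificationName} template."""
    return Path(root) / migration_name / verification_name / "verification-missing-on-source.json"


def objects_to_be_dropped(report):
    """List objects flagged MISSING_ON_SOURCE, which a repair would drop from the target.

    ASSUMPTION: the report has a top-level "summaryList" array with
    "verifyResult" and "metadataObjectName" fields, like the full-content report.
    """
    return sorted(
        entry["metadataObjectName"]
        for entry in report.get("summaryList", [])
        if entry.get("verifyResult") == "MISSING_ON_SOURCE"
    )


# Hypothetical names and inline sample data, for illustration only.
path = missing_on_source_report("migration1", "migration1-1752065069284")
sample = json.loads("""
{ "summaryList" : [
  { "verifyResult" : "MISSING_ON_SOURCE", "metadataObjectName" : "database1.table1" }
] }
""")

print(path.as_posix())
for name in objects_to_be_dropped(sample):
    print("would be dropped from target:", name)
```

Printing the list does not replace a manual review. It only makes the set of objects explicit before you approve the repair.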

Repair Reports

After a repair is complete, new reports are generated in the directory of the verification associated with the repair, for example: /opt/wandisco/hivemigrator/verifications/{migrationName}/{verificationName}/

The following reports are generated once a repair has been completed:

  • repair-summary.json
  • repair-full-content.json

The full-content report shows all attempted repairs along with the associated repair results.

Review the full-content report

full-content.json example
"objectsRepairedList" : [ {
  "repairAction" : "DROP_ON_TARGET",
  "repairResult" : "OK",
  "objectName" : "database.table1.partition:id2=4",
  "objectType" : "PARTITION",
  "timestamp" : 1761051678524,
  "failure" : null
}, {
  "repairAction" : "DROP_ON_TARGET",
  "repairResult" : "FAILED",
  "objectName" : "database.table2",
  "objectType" : "TABLE",
  "timestamp" : 1761051679000,
  "failure" : "RangerAccessControlException: Permission denied: user=hive, access=WRITE, inode=\"/.../.../\""
} ],

| repairAction | repairResult | Object | Failure |
|---|---|---|---|
| DROP_ON_TARGET | OK | database.table1.partition:id2=4 | Null |
| DROP_ON_TARGET | FAILED | database.table2 | RangerAccessControlException: Permission denied: user=hive, access=WRITE, inode="/.../.../" |
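To spot repairs that did not succeed, the following Python sketch filters the report for entries whose result is not OK. It assumes the repair full-content report is a JSON object with a top-level `objectsRepairedList` array, as in the excerpt above. The function name is hypothetical, and the sample failure message is shortened.

```python
import json


def failed_repairs(report):
    """Return (objectName, failure) pairs for repairs whose result is not OK."""
    return [
        (entry["objectName"], entry.get("failure"))
        for entry in report.get("objectsRepairedList", [])
        if entry.get("repairResult") != "OK"
    ]


# Sample data based on the example excerpt; the enclosing JSON object is an
# assumption about the real file.
sample = json.loads("""
{ "objectsRepairedList" : [
  { "repairAction" : "DROP_ON_TARGET", "repairResult" : "OK",
    "objectName" : "database.table1.partition:id2=4", "objectType" : "PARTITION",
    "timestamp" : 1761051678524, "failure" : null },
  { "repairAction" : "DROP_ON_TARGET", "repairResult" : "FAILED",
    "objectName" : "database.table2", "objectType" : "TABLE",
    "timestamp" : 1761051679000,
    "failure" : "RangerAccessControlException: Permission denied: user=hive, access=WRITE" }
] }
""")

for name, reason in failed_repairs(sample):
    print(name, "->", reason)
```

An empty result means every attempted repair in the report returned OK.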

Repair inconsistencies with the CLI

Use the following commands to manage repairs.

hive migration verification repair

Start a new repair for a completed migration verification.

Example

Trigger a new repair against an existing verification
hive migration verification repair --verification-id migration1-1760957081968

hive migration verification repair cancel

Cancel an in-progress repair job.

Example

Stop a repair
hive migration verification repair cancel --verification-id migration1-1760957081968