Verify metadata migrations
Metadata Verification
Discover discrepancies (inconsistencies) between source and target metadata migration paths by running a verification on Live, Complete, or Stopped metadata migrations.
Limitations and considerations
- In Data Migrator 3.3+, only one verification can run at a time. Any additional verifications that are triggered enter a global queue.
- Verifications are supported only when both source and target are Hive agents of the same version (e.g., Hive 3 -> Hive 3), applying to both local and remote agents
Prerequisites
Migrations must have one of the following statuses to perform verification scans:
- Live (real-time event stream notifications for changes to the source that are replicated to the target)
- Complete (one-time migration without event stream notifications)
- Stopped (a user has stopped the migration manually)
Running a verification while a migration is in progress is not recommended. Because the target is actively changing, the results are likely to be inaccurate or misleading.
Objects Verified
The following object types are verified as part of each verification:
- Database
- Table
- Partition
- Constraint
- Materialized View
Verify migrations with the UI
View the migration status
From the Dashboard, select the metadata migration you want to verify.
On the Metadata Migration Verification panel, you can see:
- The option to view the Verification Summary Report
- Verification Status - Not Started, In Progress, Complete, Queued.
- Total Inconsistencies - Number of discrepancies between the source and target paths.
Verify a migration
Use the following steps to create a new verification for a migration:
- Select Migration Verification from the sidebar menu.
- Click the Start Verification button.
Cancel a verification
You can cancel a verification that is in progress or queued. After you select Start verification, you can simply select Cancel check.
View a verification summary report
After you select Start verification, the Verification Summary Report panel is updated with information from the source and target found during the verification scan. You can view this while the verification is in progress or when it's complete or canceled. When the verification starts checking the source and target, you can compare:
- The number of objects found on the source and the target
- Total number of inconsistencies found, including object mismatches, missing or extra objects.
Verifications Report Directory
Completed verification reports for metadata can be found in /opt/wandisco/hivemigrator/verifications/.
When a metadata migration is Reset, the older migration verification reports in the verification's report directory are not removed.
Download a full verification report
After the verification is complete, you can view and expand reports for completed verifications under Last x Verification Reports. Select Download all files to view, share, and analyze the results of the metadata migration verification. This will give you a full report as a tar archive file containing summary.json and the following files:
- verification-discrepancy.json
- verification-full-content.json
- verification-summary.json
- verification-missing-on-source.json
Reports are downloaded as .gz files. Use a tool like gunzip or any compatible decompression utility to extract the file before viewing.
The verification-discrepancy report shows all discrepancies (inconsistencies) between these two metastores while the full-content report shows discrepancies and all checks.
To download the full verification report, under Last x Verification Reports, expand the main verification report you need and select the download all files icon. Reports that have failed or been cancelled cannot be downloaded. You can identify these reports because they have no date and time in the 'Report Generated' column.
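Once a report has been downloaded, the JSON files need to be unpacked before they can be read. The following is a minimal sketch, assuming the download is a tar archive that may or may not be gzip-compressed. The function name `extract_report` and the demo file names are illustrative and are not part of the product. The demo builds a throwaway archive so the sketch runs without a real download.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def extract_report(archive_path: str, dest_dir: str) -> list:
    """Extract the JSON files from a report archive and return their names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    names = []
    # "r:*" lets tarfile detect the compression (plain tar or gzip-compressed tar).
    with tarfile.open(archive_path, mode="r:*") as archive:
        for member in archive.getmembers():
            if not (member.isfile() and member.name.endswith(".json")):
                continue
            # Write under the base name only, so entries cannot escape dest_dir.
            name = Path(member.name).name
            with archive.extractfile(member) as src:
                (dest / name).write_bytes(src.read())
            names.append(name)
    return sorted(names)


# Demo with a throwaway archive (assumption: not a real verification report).
with tempfile.TemporaryDirectory() as tmp:
    archive_path = Path(tmp) / "report.tar.gz"
    payload = b'{"summaryList": []}'
    with tarfile.open(archive_path, mode="w:gz") as archive:
        info = tarfile.TarInfo(name="verification-summary.json")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    names = extract_report(str(archive_path), str(Path(tmp) / "out"))
    print(names)
```

Reading each member through `extractfile` rather than extracting the whole archive keeps only the JSON reports and avoids writing outside the destination directory.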
Review the full-content report
Example full-content report.
"summaryList" : [ {
"verifyResult" : "OK",
"metadataObjectName" : "database1"
}, {
"verifyResult" : "MISSING_ON_SOURCE",
"metadataObjectName" : "database1.table1"
}, {
"verifyResult" : "MISSING_ON_TARGET",
"metadataObjectName" : "database1.table2"
}, {
"verifyResult" : "PROPERTY_MISMATCH",
"metadataObjectName" : "database1.table3"
} ]
| scanResult | reason | Object | Explanation | Inconsistency |
|---|---|---|---|---|
| OK | OK | database1 | Object exists on source and target and is consistent | No Inconsistency |
| MISSING_ON_SOURCE | Object is missing from the source but is present on the target | database1.table1 | Object dropped from source after migration | Inconsistency |
| MISSING_ON_TARGET | Object is missing on the target but is present on the source | database1.table2 | Object added to source after migration | Inconsistency |
| MISMATCH | Table owner was different | database1.table3 | Table owner changed after migration | Inconsistency |
scanResult and reason types
| scanResult | Description |
|---|---|
| OK | Object exists on both source and target metastore and is consistent |
| MISSING_ON_TARGET | Object exists on source but not on target |
| MISSING_ON_SOURCE | Object exists on target but not on source |
| MISMATCH | Object exists on both but inconsistent |
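For large reports, it can help to tally the results rather than read them entry by entry. The following is a minimal sketch that counts entries per result value and lists the objects that are not OK. It assumes the `summaryList` array from the example report is a key of a top-level JSON object, which the excerpt does not show. The helper names `summarize` and `inconsistent_objects` are illustrative.

```python
import json
from collections import Counter

# Sample shaped like the example full-content report; in practice the text
# would be read from verification-full-content.json.
# Assumption: summaryList sits inside a top-level JSON object.
SAMPLE = """
{
  "summaryList": [
    {"verifyResult": "OK", "metadataObjectName": "database1"},
    {"verifyResult": "MISSING_ON_SOURCE", "metadataObjectName": "database1.table1"},
    {"verifyResult": "MISSING_ON_TARGET", "metadataObjectName": "database1.table2"},
    {"verifyResult": "PROPERTY_MISMATCH", "metadataObjectName": "database1.table3"}
  ]
}
"""


def summarize(report: dict) -> Counter:
    """Count entries per verifyResult value."""
    return Counter(entry["verifyResult"] for entry in report.get("summaryList", []))


def inconsistent_objects(report: dict) -> list:
    """Return the names of objects whose result is anything other than OK."""
    return [
        entry["metadataObjectName"]
        for entry in report.get("summaryList", [])
        if entry["verifyResult"] != "OK"
    ]


report = json.loads(SAMPLE)
counts = summarize(report)
for result, count in sorted(counts.items()):
    print(f"{result}: {count}")
print(inconsistent_objects(report))
```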
Verify migrations with the CLI
Use the following commands to manage verifications.
hive migration verification start
Start a new verification for a migration.
Example
hive migration verification start --name migration1
hive migration verification list
List summaries for all or specified metadata verifications.
Example
hive migration verification list
hive migration verification show
Show the status of a specific migration verification.
Example
hive migration verification show --verification-id migration1-1752065069284
hive migration verification stop
Stop a queued or in-progress migration verification.
Example
hive migration verification stop --verification-id migration1-1752065069284
hive migration verification report
Download a full verification report.
Example
hive migration verification report --verification-id migration1-1752065069284 --out-dir /user/exampleVerificationDirectory
Metadata Verification Repair [Preview]
This functionality allows you to repair discrepancies (inconsistencies) identified by metadata verifications.
Limitations and considerations
Please note this feature is currently in preview and should NOT be used in production as it carries out destructive actions against target Hive metastores.
Limitations include but are not limited to:
- Only the MISSING_ON_SOURCE inconsistency type is supported
- Command-line interface (CLI) support only. A user interface (UI) for this feature is not available in this release
- Limited error handling and notifications
- Only one repair can be run at a time
- Repair must be completed before triggering a new one
- Repairs are supported only when both source and target are Hive agents of the same version (e.g., Hive 3 -> Hive 3), applying to both local and remote agents.
Prerequisites
A repair can only be carried out against a completed metadata verification. Migrations must have one of the following statuses to perform repair actions:
- Live (real-time event stream notifications for changes to the source that are replicated to the target)
- Complete (one-time migration without event stream notifications)
- Stopped (a user has stopped the migration manually)
Running a repair while a metadata migration is live can lead to unexpected results if changes are actively being made to either the source or the target, because both the migration and the repair may try to act on the same metadata object. Additionally, we recommend that no other processes write to the target during the repair operation. For example, compaction should be turned off. The verification report generation process does not wait for actions triggered by a repair to complete, so wait for the repair to finish before running another verification.
Enable Metadata Verification Repair
As the repair functionality is in preview, by default, it is not enabled. To enable:
- Add the configuration preview.feature.verification-repair=ON to /opt/wandisco/hivemigrator/application.properties
- Run systemctl restart hivemigrator to restart the service for the changes to take effect
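Before restarting the service, it may be useful to confirm that the flag is actually present in the properties file. The following is a minimal sketch that scans properties text for the key. It assumes simple `key=value` lines and compares the value exactly against `ON` as written above. Whether the product accepts other spellings is not stated here. The function name `repair_enabled` and the `some.other.setting` line are hypothetical.

```python
def repair_enabled(properties_text: str) -> bool:
    """Return True if the preview repair flag is set to ON in the given text."""
    enabled = False
    for raw in properties_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip() == "preview.feature.verification-repair":
            enabled = value.strip() == "ON"  # the last occurrence wins
    return enabled


# Hypothetical excerpt; in practice read the text from
# /opt/wandisco/hivemigrator/application.properties.
SAMPLE_PROPERTIES = """\
# illustrative content only
some.other.setting=value
preview.feature.verification-repair=ON
"""

print(repair_enabled(SAMPLE_PROPERTIES))
```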
Objects Repaired
The following object types are supported:
- Table
- Partition
- Constraint
- Materialized View
Inconsistencies Repaired
Currently, only the following inconsistencies can be repaired:
| scanResult | Repair Action |
|---|---|
| MISSING_ON_SOURCE | HVM will drop metadata objects that exist on the target but not on the source |
| MISSING_ON_TARGET | Currently not supported |
| MISMATCH | Currently not supported |
Usage
A metadata repair can be triggered against a completed metadata verification that has reported inconsistencies.
Once triggered, the new repair is associated with the existing metadata verification. Each metadata verification can only be repaired once. If a verification has already been repaired, or a repair has been attempted, you need to trigger another verification before triggering a repair.
Since repairs carry out destructive actions, it is vital that you review the list of inconsistencies reported by a verification and approve of the objects to be dropped from the target metastore before triggering a repair. Once triggered, this action is NOT reversible.
- Trigger a metadata verification against a migration that is Live, Complete, or Stopped.
- Review the entire list of metadata objects that have been reported as inconsistent by the verification. The reports can be downloaded from the UI or found at: /opt/wandisco/hivemigrator/verifications/{migrationName}/{verificationName}/verification-missing-on-source.json
- Run a repair job against your metadata verification.
Once a repair is triggered, all of the objects identified and listed within this report will be dropped from your target metastore.
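Because every object in that report will be dropped, it is worth reading the file in full before triggering the repair. The following is a minimal sketch that builds the report path from the pattern given in the steps above and prints the file if it exists. It runs on the Data Migrator host. The names `migration1` and `migration1-1760957081968` are placeholders, and treating the verification ID as `{verificationName}` is an assumption. The sketch does not assume anything about the structure of the JSON inside the file.

```python
from pathlib import Path

# Root directory taken from the path pattern in the steps above.
VERIFICATIONS_ROOT = Path("/opt/wandisco/hivemigrator/verifications")


def missing_on_source_report(migration_name: str, verification_name: str) -> Path:
    """Build the path to the report that lists objects a repair would drop."""
    return (
        VERIFICATIONS_ROOT
        / migration_name
        / verification_name
        / "verification-missing-on-source.json"
    )


# Placeholder names; assumption: the verification ID is used as {verificationName}.
path = missing_on_source_report("migration1", "migration1-1760957081968")
print(path)
if path.exists():
    print(path.read_text())  # review every listed object before repairing
else:
    print("Report not found. Run a verification first.")
```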
Repair Reports
After a repair is complete, new reports are generated in the directory of the verification associated with the repair, for example:
/opt/wandisco/hivemigrator/verifications/{migrationName}/{verificationName}/
The following reports are generated once a repair has been completed:
- repair-summary.json
- repair-full-content.json
The full-content report shows all attempted repairs along with the associated repair results.
Review the full-content report
"objectsRepairedList" : [ {
"repairAction" : "DROP_ON_TARGET",
"repairResult" : "OK",
"objectName" : "database.table1.partition:id2=4",
"objectType" : "PARTITION",
"timestamp" : 1761051678524,
"failure" : null
}, {
"repairAction" : "DROP_ON_TARGET",
"repairResult" : "FAILED",
"objectName" : "database.table2",
"objectType" : "TABLE",
"timestamp" : 1761051679000,
"failure" : "RangerAccessControlException: Permission denied: user=hive, access=WRITE, inode=\"/.../.../\""
} ],
| repairAction | repairResult | Object | Failure |
|---|---|---|---|
| DROP_ON_TARGET | OK | database.table1.partition:id2=4 | Null |
| DROP_ON_TARGET | FAILED | database.table2 | RangerAccessControlException: Permission denied: user=hive, access=WRITE, inode="/.../.../" |
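Failed repairs are the entries that usually need follow-up. The following is a minimal sketch that filters them out of a repair full-content report. It assumes `objectsRepairedList` is a key of a top-level JSON object and that `timestamp` is in epoch milliseconds. Neither is confirmed by the excerpt. The helper name `failed_repairs` is illustrative, and the sample failure message is shortened.

```python
import json
from datetime import datetime, timezone

# Sample shaped like the example repair report; in practice the text would be
# read from repair-full-content.json.
# Assumption: objectsRepairedList sits inside a top-level JSON object.
SAMPLE = """
{
  "objectsRepairedList": [
    {
      "repairAction": "DROP_ON_TARGET",
      "repairResult": "OK",
      "objectName": "database.table1.partition:id2=4",
      "objectType": "PARTITION",
      "timestamp": 1761051678524,
      "failure": null
    },
    {
      "repairAction": "DROP_ON_TARGET",
      "repairResult": "FAILED",
      "objectName": "database.table2",
      "objectType": "TABLE",
      "timestamp": 1761051679000,
      "failure": "RangerAccessControlException: Permission denied: user=hive, access=WRITE"
    }
  ]
}
"""


def failed_repairs(report: dict) -> list:
    """Return the entries whose repairResult is anything other than OK."""
    return [
        entry
        for entry in report.get("objectsRepairedList", [])
        if entry.get("repairResult") != "OK"
    ]


report = json.loads(SAMPLE)
for entry in failed_repairs(report):
    # Assumption: timestamp is epoch milliseconds.
    when = datetime.fromtimestamp(entry["timestamp"] / 1000, tz=timezone.utc)
    print(f'{when.isoformat()} {entry["objectType"]} {entry["objectName"]}: {entry["failure"]}')
```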
Repair inconsistencies with the CLI
Use the following commands to manage repairs.
hive migration verification repair
Start a new repair for a completed metadata verification.
Example
hive migration verification repair --verification-id migration1-1760957081968
hive migration verification repair cancel
Cancel an in-progress repair job.
Example
hive migration verification repair cancel --verification-id migration1-1760957081968