Verify metadata migrations
Metadata Verification
Discover discrepancies (inconsistencies) between source and target metadata migration paths by running a verification on Live, Complete, or Stopped metadata migrations.
Prerequisites
Migrations must have one of the following statuses to perform verification scans:
- Live (real-time event stream notifications for changes to the source that are replicated to the target)
- Complete (one-time migration without event stream notifications)
- Stopped (a user has stopped the migration manually)
Running a verification while a migration is still in progress is not recommended: the target is actively changing, so the results are likely to be inaccurate or misleading.
Configurations
The allocation of resources and concurrency settings for metadata verifications are configurable via the API. For details on how to configure metadata verifications, please visit Metadata Verification and Repairs.
Objects Verified
The following object types are verified as part of each verification:
- Database
- Table
- Partition
- Constraint
- View
- Materialized View
Verify migrations with the UI
View the verification status
From the Dashboard, select the metadata migration you want to verify.
The Verification panel shows:
- Verification Summary Report
- Verification Status - Not Started, In Progress, Complete, Queued.
- Total Inconsistencies - Number of discrepancies between the source and target paths.
Verify a migration
Follow these steps to start a new verification for a migration:
Select the metadata migration you want to verify.
Select Verification from the sidebar menu.
Click the Start Verification button.
Cancel a verification
To cancel a verification, whether it is currently running or waiting in the queue, select Cancel Verification after the verification process has begun.
View a verification summary report
After you select Start Verification, the Verification Summary Report panel is updated with information found on the source and target during the verification scan. You can view the report while the verification is in progress or after it is complete or canceled. Once the verification starts checking the source and target, you can compare:
- The number of objects found on the source and the target
- Total number of inconsistencies found, including object mismatches, missing or extra objects.
Verifications Report Directory
Completed verification reports for metadata can be found in /opt/wandisco/hivemigrator/verifications/.
When a metadata migration is Reset, older verification reports for that migration are not removed from the verification report directory.
Download a full verification report
After the verification is complete, you can view and expand reports for completed verifications under Last x Verification Reports. Select Download all files to view, share, and analyze the results of the metadata verification. Reports for failed or cancelled verifications cannot be downloaded; they can be identified by the absence of a date and time in the 'Report Generated' column. Download all files gives you a full report as a tar archive containing the following files:
- verification-discrepancy.json
- verification-full-content.json
- verification-summary.json
- verification-missing-on-source.json
- verification-missing-on-target.json
Reports are downloaded as .gz files. Use a tool like gunzip or any compatible decompression utility to extract the file before viewing.
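The extraction step can also be scripted. The following is a minimal Python sketch, not part of the product: the function name `extract_report` and the file paths are hypothetical. Because the download is described both as a tar archive and as a .gz file, the sketch accepts either a (possibly gzip-compressed) tar archive or a single gzip-compressed file.

```python
import gzip
import shutil
import tarfile
from pathlib import Path


def extract_report(archive: Path, out_dir: Path) -> list[Path]:
    """Unpack a downloaded report into out_dir and return the extracted files.

    Hypothetical helper: handles a tar archive (plain or gzip-compressed)
    or a single gzip-compressed file.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    extracted: list[Path] = []
    if tarfile.is_tarfile(archive):
        # mode "r:*" lets tarfile detect the compression automatically.
        with tarfile.open(archive, mode="r:*") as tar:
            for member in tar.getmembers():
                if not member.isfile():
                    continue
                # Keep only the base name so nothing is written outside out_dir.
                target = out_dir / Path(member.name).name
                with tar.extractfile(member) as src, open(target, "wb") as dst:
                    shutil.copyfileobj(src, dst)
                extracted.append(target)
        return sorted(extracted)
    # Not a tar archive: treat it as one gzip-compressed file.
    target = out_dir / archive.name.removesuffix(".gz")
    with gzip.open(archive, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return [target]
```

For example, `extract_report(Path("reports.tar.gz"), Path("reports"))` would place the JSON files in a `reports` directory; both names are placeholders.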
The verification-discrepancy report shows all discrepancies (inconsistencies) between the source and target metastores, while the full-content report shows discrepancies and all other objects checked.
Review the full-content report
Example full-content report.
"summaryList" : [ {
"verifyResult" : "OK",
"metadataObjectName" : "database1"
}, {
"verifyResult" : "MISSING_ON_SOURCE",
"metadataObjectName" : "database1.table1"
}, {
"verifyResult" : "MISSING_ON_TARGET",
"metadataObjectName" : "database1.table2"
}, {
"verifyResult" : "PROPERTY_MISMATCH",
"metadataObjectName" : "database1.table3",
"propertyMismatches" : [ {
"fieldName" : "Owner",
"sourceValue" : "hive",
"targetValue" : "hadoop"
} ]
} ]
| scanResult | reason | Object | Explanation | Inconsistency |
|---|---|---|---|---|
| OK | OK | database1 | Object exists on source and target and is consistent | No Inconsistency |
| MISSING_ON_SOURCE | Object is missing from the source but is present on the target | database1.table1 | Object dropped from source after migration | Inconsistency |
| MISSING_ON_TARGET | Object is missing on the target but is present on the source | database1.table2 | Object added to source after migration | Inconsistency |
| MISMATCH | Table owner was different | database1.table3 | Table owner changed after migration | Inconsistency |
scanResult and reason types
| scanResult | Description |
|---|---|
| OK | Object exists on both source and target metastore and is consistent |
| MISSING_ON_TARGET | Object exists on source but not on target |
| MISSING_ON_SOURCE | Object exists on target but not on source |
| MISMATCH | Object exists on both but inconsistent |
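To get a quick overview of a downloaded full-content report, the entries can be tallied by result. The sketch below is illustrative only: it assumes the report is a JSON object with a top-level `summaryList` array whose entries carry the `verifyResult` field shown in the example report (the example is a fragment, so the enclosing object is an assumption). The function name `tally_results` is hypothetical.

```python
import json
from collections import Counter


def tally_results(report_text: str) -> Counter:
    """Count report entries per verifyResult value.

    Assumes a top-level "summaryList" array, as in the example fragment.
    """
    report = json.loads(report_text)
    return Counter(entry["verifyResult"] for entry in report["summaryList"])


# Small inline sample modelled on the example report.
sample = """
{ "summaryList" : [
  { "verifyResult" : "OK", "metadataObjectName" : "database1" },
  { "verifyResult" : "MISSING_ON_SOURCE", "metadataObjectName" : "database1.table1" },
  { "verifyResult" : "MISSING_ON_TARGET", "metadataObjectName" : "database1.table2" }
] }
"""

counts = tally_results(sample)
# Everything that is not "OK" counts as an inconsistency.
inconsistent = sum(n for result, n in counts.items() if result != "OK")
print(dict(counts), "inconsistencies:", inconsistent)
```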
Verify migrations with the CLI
Use the following commands to manage verifications.
hive migration verification start
Start a new verification for a migration.
Example
hive migration verification start --name migration1
hive migration verification list
List summaries for all or specified metadata verifications.
Example
hive migration verification list
hive migration verification show
Show the status of a specific migration verification.
Example
hive migration verification show --verification-id migration1-1752065069284
hive migration verification stop
Stop a queued or in-progress migration verification.
Example
hive migration verification stop --verification-id migration1-1752065069284
hive migration verification report
Download a full verification report.
Example
hive migration verification report --verification-id migration1-1752065069284 --out-dir /user/exampleVerificationDirectory
Metadata Verification Repair [Preview]
This functionality allows you to repair discrepancies (inconsistencies) identified by metadata verifications.
Limitations and considerations
Please note this feature is currently in preview and should NOT be used in production as it carries out destructive actions against target Hive metastores.
Limitations include but are not limited to:
- Only one repair can be run at a time
- Only the MISSING_ON_SOURCE inconsistency type is supported
- Repairs are supported only when both source and target are Hive agents
Prerequisites
A repair can only be carried out against a completed metadata verification. Migrations must have one of the following statuses to perform repair actions:
- Live (real-time event stream notifications for changes to the source that are replicated to the target)
- Complete (one-time migration without event stream notifications)
- Stopped (a user has stopped the migration manually)
Running a repair while a metadata migration is Live can lead to unexpected results if changes are still being made to the source or target, because both the migration and the repair may try to act on the same metadata object. We also recommend that no other processes write to the target during the repair; for example, turn off compaction. Verification report generation does not wait for actions triggered by a repair to complete, so wait for the repair to finish before running another verification.
Enable Metadata Verification Repair
As the repair functionality is in preview, by default, it is not enabled.
To enable in Hive Migrator:
- Add the configuration preview.feature.verification-repair=ON to /etc/wandisco/hivemigrator/application.properties
- Run systemctl restart hivemigrator to restart the service for the changes to take effect
To enable in the UI:
- Add the configuration application.hiveMigrator.verificationRepair.enabled=true to /etc/wandisco/ui/application-prod.properties
- Run systemctl restart livedata-ui to restart the service for the changes to take effect
Objects Repaired
The following object types are supported:
- Table
- Partition
- Constraint
- View
- Materialized View
Inconsistencies Repaired
Currently, only the following inconsistencies can be repaired:
| scanResult | Repair Action |
|---|---|
| MISSING_ON_SOURCE | HVM will drop metadata objects that exist on the target but not on the source |
| MISSING_ON_TARGET | Currently not supported |
| MISMATCH | Currently not supported |
Usage
A metadata repair can be triggered against a completed metadata verification that has reported inconsistencies.
Once triggered, the new repair will become associated with the existing metadata verification it was run against. A metadata verification can only have one associated repair attempt. If a repair has already been performed or an attempt has been made, you must initiate a new verification before another repair can be triggered.
Since repairs carry out destructive actions, it is vital that you review the list of inconsistencies reported by a verification and approve of the objects to be dropped from the target metastore before triggering a repair. Once triggered, this action is NOT reversible.
- Trigger a metadata verification against a migration that is Live, Complete, or Stopped.
- Review the entire list of metadata objects that have been reported as inconsistent by the verification. The reports can be downloaded from the UI or found at:
/opt/wandisco/hivemigrator/verifications/{migrationName}/{verificationName}/verification-missing-on-source.json
- Run a repair job against your metadata verification.
Once a repair is triggered, all of the objects identified and listed within this report will be dropped from your target metastore.
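Because the drop is irreversible, it can help to produce a reviewable list of the affected objects first. The sketch below is a hedged illustration: it filters MISSING_ON_SOURCE entries out of a report that uses the `summaryList` layout shown in the full-content example. The layout of verification-missing-on-source.json itself is not shown in this document, so applying the same code to that file is an assumption. The function name `drop_candidates` and the grouping by the text before the first dot are also illustrative choices.

```python
import json
from collections import defaultdict


def drop_candidates(report_text: str) -> dict[str, list[str]]:
    """Group MISSING_ON_SOURCE object names by their leading database name.

    Assumes the "summaryList" / "verifyResult" / "metadataObjectName"
    layout from the full-content example report.
    """
    report = json.loads(report_text)
    grouped: dict[str, list[str]] = defaultdict(list)
    for entry in report["summaryList"]:
        if entry["verifyResult"] == "MISSING_ON_SOURCE":
            name = entry["metadataObjectName"]
            # Assumption: names look like "database.object", as in the example.
            database = name.split(".", 1)[0]
            grouped[database].append(name)
    return {db: sorted(names) for db, names in sorted(grouped.items())}


sample = """
{ "summaryList" : [
  { "verifyResult" : "OK", "metadataObjectName" : "database1" },
  { "verifyResult" : "MISSING_ON_SOURCE", "metadataObjectName" : "database1.table1" },
  { "verifyResult" : "MISSING_ON_TARGET", "metadataObjectName" : "database1.table2" }
] }
"""

for database, names in drop_candidates(sample).items():
    print(database)
    for name in names:
        print("  would be dropped from target:", name)
```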
Repair Reports
Upon completion of a repair, new reports are generated. They are written to the directory of the verification associated with that repair, for example:
/opt/wandisco/hivemigrator/verifications/{migrationName}/{verificationName}/
The following reports are generated once a repair has been completed or cancelled:
- repair-summary.json
- repair-full-content.json
The full-content report shows all attempted repairs along with the associated repair results.
Review the full-content report
"objectsRepairedList" : [ {
"repairAction" : "DROP_ON_TARGET",
"repairResult" : "OK",
"objectName" : "database.table1.partition:id2=4",
"objectType" : "PARTITION",
"timestamp" : 1761051678524,
"failure" : null
}, {
"repairAction" : "DROP_ON_TARGET",
"repairResult" : "FAILED",
"objectName" : "database.table2",
"objectType" : "TABLE",
"timestamp" : 1761051679000,
"failure" : "RangerAccessControlException: Permission denied: user=hive, access=WRITE, inode=\"/.../.../\""
} ],
| repairAction | repairResult | Object | Failure |
|---|---|---|---|
| DROP_ON_TARGET | OK | database.table1.partition:id2=4 | Null |
| DROP_ON_TARGET | FAILED | database.table2 | RangerAccessControlException: Permission denied: user=hive, access=WRITE, inode="/.../.../" |
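Failed repairs are usually the entries worth following up. As a minimal sketch, the Python below lists every entry whose `repairResult` is not OK, using the field names from the example above. It assumes the report is a JSON object with a top-level `objectsRepairedList` array and that `timestamp` is in epoch milliseconds (an inference from the sample values, not a documented fact). The function name `failed_repairs` is hypothetical.

```python
import json
from datetime import datetime, timezone


def failed_repairs(report_text: str) -> list[dict]:
    """Return one summary dict per repair entry that did not succeed."""
    report = json.loads(report_text)
    failures = []
    for entry in report["objectsRepairedList"]:
        if entry["repairResult"] == "OK":
            continue
        # Assumption: "timestamp" is milliseconds since the Unix epoch.
        when = datetime.fromtimestamp(entry["timestamp"] / 1000, tz=timezone.utc)
        failures.append({
            "object": entry["objectName"],
            "type": entry["objectType"],
            "when": when.isoformat(),
            "failure": entry["failure"],
        })
    return failures


sample = """
{ "objectsRepairedList" : [
  { "repairAction" : "DROP_ON_TARGET", "repairResult" : "OK",
    "objectName" : "database.table1.partition:id2=4", "objectType" : "PARTITION",
    "timestamp" : 1761051678524, "failure" : null },
  { "repairAction" : "DROP_ON_TARGET", "repairResult" : "FAILED",
    "objectName" : "database.table2", "objectType" : "TABLE",
    "timestamp" : 1761051679000, "failure" : "Permission denied" }
] }
"""

for item in failed_repairs(sample):
    print(item["when"], item["type"], item["object"], "->", item["failure"])
```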
Repair inconsistencies with the UI
Repair inconsistencies
- Navigate to a metadata migration that has a completed verification report indicating inconsistencies.
- Click the Repair button.
- Review the dialogue box, confirm the number of objects scheduled for repair, and then select Repair to proceed.
Cancel a repair
To cancel a repair, whether it is currently running or waiting in the queue, select Cancel Repair after the repair process has begun.
View a repair summary report
Once you select Repair, the Repair panel displays the progress and details of the ongoing repair job. You can monitor this panel while the repair is in progress, as well as after it is complete or canceled.
When the repair begins, the following information is available:
- Repair status
- Repair progress
- Repair start and complete times
- The number of inconsistencies:
- Attempted to be repaired
- Successfully repaired
- Failed to be repaired
Verifications Report Directory
Completed repair reports for metadata can be found in /opt/wandisco/hivemigrator/verifications/.
When a metadata migration is Reset, older repair reports for that migration are not removed from the report directory.
Download a full repair report
Once the repair is finished, you can examine and expand the reports for the completed repairs in the Last x Verification Repairs section. You have the option to download either individual reports or select Download all files to get a single tar archive. The archive provides all necessary files to view and analyze the results of the metadata verification repair, specifically including:
- repair-summary.json
- repair-full-content.json
Reports are downloaded as .gz files. Use a tool like gunzip or any compatible decompression utility to extract the file before viewing.
Repair inconsistencies with the CLI
Use the following commands to manage repairs.
hive migration verification repair
Start a new repair for a migration.
Example
hive migration verification repair --verification-id migration1-1760957081968
hive migration verification repair cancel
Cancel an in-progress repair job.
Example
hive migration verification repair cancel --verification-id migration1-1760957081968