Version: 3.4 (latest)

Verify metadata migrations

Metadata Verification

Discover discrepancies (inconsistencies) between source and target metadata migration paths by running a verification on Live, Complete, or Stopped metadata migrations.

Prerequisites

Migrations must have one of the following statuses to perform verification scans:

  • Live (real-time event stream notifications for changes to the source that are replicated to the target)
  • Complete (one-time migration without event stream notifications)
  • Stopped (a user has stopped the migration manually)

note

Avoid running a verification while a migration is in progress. Because the target is actively changing, the results are likely to be inaccurate or misleading.

Configurations

The allocation of resources and concurrency settings for metadata verifications are configurable via the API. For details on how to configure metadata verifications, please visit Metadata Verification and Repairs.

Objects Verified

The following object types are verified as part of each verification:

  • Database
  • Table
  • Partition
  • Constraint
  • View
  • Materialized View

Verify migrations with the UI

View the verification status

  1. From the Dashboard, select the metadata migration you want to verify.

  2. The Verification panel shows:

    • The Verification Summary Report.
    • Verification Status - Not Started, Queued, In Progress, or Complete.
    • Total Inconsistencies - Number of discrepancies between the source and target paths.

Verify a migration

To trigger a new verification for a migration:

  1. Select the metadata migration you want to verify.

  2. Select Verification from the sidebar menu.

  3. Click the Start Verification button.

Cancel a verification

To cancel a verification, whether it is currently running or waiting in the queue, select Cancel Verification after the verification process has begun.

View a verification summary report

After you select Start Verification, the Verification Summary Report panel is updated with information found on the source and target during the verification scan. You can view this panel while the verification is in progress, or after it is complete or canceled. Once the verification starts checking the source and target, you can compare:

  • The number of objects found on the source and the target
  • The total number of inconsistencies found, including object mismatches and missing or extra objects.

Verifications Report Directory

Completed verification reports for metadata can be found in /opt/wandisco/hivemigrator/verifications/.

caution

When a metadata migration is Reset, older verification reports in the verification report directory are not removed.

Download a full verification report

After the verification is complete, you can view and expand reports for completed verifications under Last x Verification Reports. Select Download all files to view, share, and analyze the results of the metadata verification. Reports for failed or cancelled verifications cannot be downloaded; you can identify them by the absence of a date and time in the 'Report Generated' column. Download all files gives you a full report as a tar archive containing the following files:

  • verification-discrepancy.json
  • verification-full-content.json
  • verification-summary.json
  • verification-missing-on-source.json
  • verification-missing-on-target.json

tip

Reports are downloaded as .gz files. Use a tool like gunzip or any compatible decompression utility to extract the file before viewing.

The verification-discrepancy report shows all discrepancies (inconsistencies) between the source and target metastores, while the full-content report shows the discrepancies and all other objects checked.
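As a minimal sketch of inspecting a downloaded report without unpacking it by hand, the Python below reads every JSON file out of the archive. It assumes the download is a tar archive, optionally gzip-compressed, as described above. The function name read_reports is hypothetical and not part of the product.

```python
import json
import tarfile


def read_reports(archive_path):
    """Return {member name: parsed JSON} for every .json file in the archive.

    Assumption: the download is a tar archive, possibly gzip-compressed.
    """
    reports = {}
    # "r:*" lets tarfile detect the compression (plain tar, gzip, ...) itself.
    with tarfile.open(archive_path, mode="r:*") as archive:
        for member in archive.getmembers():
            if member.isfile() and member.name.endswith(".json"):
                with archive.extractfile(member) as handle:
                    reports[member.name] = json.load(handle)
    return reports
```

Calling read_reports on the downloaded file returns a dictionary keyed by member name. Member names may carry a directory prefix depending on how the archive was built, so match on the file name suffix rather than the full key.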

Review the full-content report

Example full-content report.

full-content.json example
"summaryList" : [ {
  "verifyResult" : "OK",
  "metadataObjectName" : "database1"
}, {
  "verifyResult" : "MISSING_ON_SOURCE",
  "metadataObjectName" : "database1.table1"
}, {
  "verifyResult" : "MISSING_ON_TARGET",
  "metadataObjectName" : "database1.table2"
}, {
  "verifyResult" : "PROPERTY_MISMATCH",
  "metadataObjectName" : "database1.table3",
  "propertyMismatches" : [ {
    "fieldName" : "Owner",
    "sourceValue" : "hive",
    "targetValue" : "hadoop"
  } ]
} ]

| scanResult | reason | Object | Explanation | Inconsistency |
|---|---|---|---|---|
| OK | OK | database1 | Object exists on source and target and is consistent | No Inconsistency |
| MISSING_ON_SOURCE | Object is missing from the source but is present on the target | database1.table1 | Object dropped from source after migration | Inconsistency |
| MISSING_ON_TARGET | Object is missing on the target but is present on the source | database1.table2 | Object added to source after migration | Inconsistency |
| MISMATCH | Table owner was different | database1.table3 | Table owner changed after migration | Inconsistency |

scanResult and reason types

| scanResult | Description |
|---|---|
| OK | Object exists on both source and target metastore and is consistent |
| MISSING_ON_TARGET | Object exists on source but not on target |
| MISSING_ON_SOURCE | Object exists on target but not on source |
| MISMATCH | Object exists on both but is inconsistent |
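A full-content report can be long, so it may help to tally the results before reading individual entries. The following is a minimal sketch, assuming the report is a JSON object with a top-level summaryList array whose entries carry the verifyResult and metadataObjectName fields shown in the example above. The helper names are hypothetical.

```python
from collections import Counter


def tally_results(report):
    """Count summaryList entries by their verifyResult value."""
    return Counter(entry["verifyResult"] for entry in report.get("summaryList", []))


def inconsistent_objects(report):
    """Return the names of all objects whose verifyResult is not OK."""
    return [
        entry["metadataObjectName"]
        for entry in report.get("summaryList", [])
        if entry["verifyResult"] != "OK"
    ]


# Sample data mirroring the example report above (assumed report shape).
sample = {
    "summaryList": [
        {"verifyResult": "OK", "metadataObjectName": "database1"},
        {"verifyResult": "MISSING_ON_SOURCE", "metadataObjectName": "database1.table1"},
        {"verifyResult": "MISSING_ON_TARGET", "metadataObjectName": "database1.table2"},
        {
            "verifyResult": "PROPERTY_MISMATCH",
            "metadataObjectName": "database1.table3",
            "propertyMismatches": [
                {"fieldName": "Owner", "sourceValue": "hive", "targetValue": "hadoop"}
            ],
        },
    ]
}

print(tally_results(sample))
print(inconsistent_objects(sample))
```

Treating every value other than OK as an inconsistency keeps the sketch independent of the exact set of mismatch labels a report contains.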

Verify migrations with the CLI

Use the following commands to manage verifications.

hive migration verification start

Start a new verification for a migration.

Example

Trigger a new verification for a migration
hive migration verification start --name migration1

hive migration verification list

List summaries for all or specified metadata verifications.

Example

List summaries for all verifications
hive migration verification list

hive migration verification show

Show the status of a specific migration verification.

Example

Status of a completed verification
hive migration verification show --verification-id migration1-1752065069284

hive migration verification stop

Stop a queued or in-progress migration verification.

Example

Stop a migration verification
hive migration verification stop --verification-id migration1-1752065069284

hive migration verification report

Download a full verification report.

Example

Download a verification report
hive migration verification report --verification-id migration1-1752065069284 --out-dir /user/exampleVerificationDirectory

Metadata Verification Repair [Preview]

This functionality allows you to repair discrepancies (inconsistencies) identified by metadata verifications.

Limitations and considerations

caution

This feature is currently in preview and should NOT be used in production, as it carries out destructive actions against target Hive metastores.

Limitations include but are not limited to:

  • Only one repair can be run at a time
  • Only the MISSING_ON_SOURCE inconsistency type is supported
  • Repairs are supported only when both source and target are Hive agents

Prerequisites

A repair can only be carried out against a completed metadata verification. Migrations must have one of the following statuses to perform repair actions:

  • Live (real-time event stream notifications for changes to the source that are replicated to the target)
  • Complete (one-time migration without event stream notifications)
  • Stopped (a user has stopped the migration manually)

tip

Running a repair while a metadata migration is Live and changes are being made to the source or target can lead to unexpected results, because the migration and the repair may both try to act on the same metadata object. We also recommend that no other processes write to the target during the repair operation; for example, turn off compaction. Verification report generation does not wait for the actions triggered by a repair to complete, so wait for the repair to finish before running another verification.

Enable Metadata Verification Repair

Because the repair functionality is in preview, it is disabled by default.

To enable in Hive Migrator:

  1. Add the configuration preview.feature.verification-repair=ON to /etc/wandisco/hivemigrator/application.properties
  2. Run systemctl restart hivemigrator to restart the service so that the changes take effect

To enable in the UI:

  1. Add the configuration application.hiveMigrator.verificationRepair.enabled=true to /etc/wandisco/ui/application-prod.properties
  2. Run systemctl restart livedata-ui to restart the service so that the changes take effect
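If you script these changes, the property should be added only once and any existing value updated in place. The following is a minimal sketch under that assumption. The ensure_property helper is hypothetical and handles only simple key=value lines, not the full Java properties syntax.

```python
from pathlib import Path


def ensure_property(path, key, value):
    """Set key=value in a properties file, replacing an existing entry for key.

    Assumption: entries are plain "key=value" lines. Returns True if the
    file was changed and False if it already held the wanted value.
    """
    file = Path(path)
    lines = file.read_text().splitlines() if file.exists() else []
    wanted = f"{key}={value}"
    found = False
    changed = False
    for index, line in enumerate(lines):
        name = line.split("=", 1)[0].strip()
        if name == key:  # commented-out lines start with "#", so they never match
            found = True
            if line.strip() != wanted:
                lines[index] = wanted
                changed = True
    if not found:
        lines.append(wanted)
        changed = True
    if changed:
        file.write_text("\n".join(lines) + "\n")
    return changed
```

For example, ensure_property("/etc/wandisco/hivemigrator/application.properties", "preview.feature.verification-repair", "ON") would apply the Hive Migrator setting above. The call needs write access to the file, and the service still has to be restarted afterwards as described in the steps.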

Objects Repaired

The following object types are supported:

  • Table
  • Partition
  • Constraint
  • View
  • Materialized View

Inconsistencies Repaired

Currently, only the following inconsistencies can be repaired:

| scanResult | Repair Action |
|---|---|
| MISSING_ON_SOURCE | HVM will drop metadata objects that exist on the target but not on the source |
| MISSING_ON_TARGET | Currently not supported |
| MISMATCH | Currently not supported |

Usage

A metadata repair can be triggered against a completed metadata verification that has reported inconsistencies.

note

Once triggered, the new repair will become associated with the existing metadata verification it was run against. A metadata verification can only have one associated repair attempt. If a repair has already been performed or an attempt has been made, you must initiate a new verification before another repair can be triggered.

caution

Since repairs carry out destructive actions, it is vital that you review the list of inconsistencies reported by a verification and approve of the objects to be dropped from the target metastore before triggering a repair. Once triggered, this action is NOT reversible.

  1. Trigger a metadata verification against a migration that is Live, Complete, or Stopped.
  2. Review the entire list of metadata objects that have been reported as inconsistent by the verification. The reports can be downloaded from the UI or found at: /opt/wandisco/hivemigrator/verifications/{migrationName}/{verificationName}/verification-missing-on-source.json
  3. Run a repair job against your metadata verification.

note

Once a repair is triggered, all of the objects identified and listed within this report will be dropped from your target metastore.
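Because the repair drops everything listed in this report, it is worth reading the file in full first. The sketch below builds the report path from the pattern given in step 2 and pretty-prints the file without assuming anything about its internal structure. The function names and the example migration and verification names are hypothetical placeholders.

```python
import json
from pathlib import Path

# Report root as documented above.
REPORT_ROOT = Path("/opt/wandisco/hivemigrator/verifications")


def missing_on_source_report(migration_name, verification_name, root=REPORT_ROOT):
    """Build the path to verification-missing-on-source.json for one verification."""
    return (
        Path(root)
        / migration_name
        / verification_name
        / "verification-missing-on-source.json"
    )


def show_report(path):
    """Pretty-print a report so every listed object can be reviewed."""
    with open(path) as handle:
        print(json.dumps(json.load(handle), indent=2))


# Hypothetical names; substitute your own migration and verification.
print(missing_on_source_report("migration1", "verification1"))
```

Pass the resulting path to show_report on the host that holds the reports, and trigger the repair only once every listed object has been approved for removal.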

Repair Reports

When a repair completes, new reports are generated. They are written to the directory of the verification associated with that repair, for example: /opt/wandisco/hivemigrator/verifications/{migrationName}/{verificationName}/

The following reports are generated once a repair has been completed or cancelled:

  • repair-summary.json
  • repair-full-content.json

The full-content report shows all attempted repairs along with the associated repair results.

Review the full-content report

full-content.json example
"objectsRepairedList" : [ {
  "repairAction" : "DROP_ON_TARGET",
  "repairResult" : "OK",
  "objectName" : "database.table1.partition:id2=4",
  "objectType" : "PARTITION",
  "timestamp" : 1761051678524,
  "failure" : null
}, {
  "repairAction" : "DROP_ON_TARGET",
  "repairResult" : "FAILED",
  "objectName" : "database.table2",
  "objectType" : "TABLE",
  "timestamp" : 1761051679000,
  "failure" : "RangerAccessControlException: Permission denied: user=hive, access=WRITE, inode=\"/.../.../\""
} ],

| repairAction | repairResult | Object | Failure |
|---|---|---|---|
| DROP_ON_TARGET | OK | database.table1.partition:id2=4 | Null |
| DROP_ON_TARGET | FAILED | database.table2 | RangerAccessControlException: Permission denied: user=hive, access=WRITE, inode="/.../.../" |
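To see at a glance which drops failed and why, the repair report can be summarized with a few lines of Python. This is a minimal sketch, assuming the report is a JSON object with a top-level objectsRepairedList array whose entries match the example above. The summarize_repairs helper is hypothetical.

```python
from collections import Counter


def summarize_repairs(report):
    """Return (counts by repairResult, list of (objectName, failure) for non-OK entries)."""
    entries = report.get("objectsRepairedList", [])
    counts = Counter(entry["repairResult"] for entry in entries)
    failures = [
        (entry["objectName"], entry["failure"])
        for entry in entries
        if entry["repairResult"] != "OK"
    ]
    return counts, failures


# Sample data mirroring the example report above (assumed report shape).
sample = {
    "objectsRepairedList": [
        {
            "repairAction": "DROP_ON_TARGET",
            "repairResult": "OK",
            "objectName": "database.table1.partition:id2=4",
            "objectType": "PARTITION",
            "timestamp": 1761051678524,
            "failure": None,
        },
        {
            "repairAction": "DROP_ON_TARGET",
            "repairResult": "FAILED",
            "objectName": "database.table2",
            "objectType": "TABLE",
            "timestamp": 1761051679000,
            "failure": 'RangerAccessControlException: Permission denied: '
                       'user=hive, access=WRITE, inode="/.../.../"',
        },
    ]
}

counts, failures = summarize_repairs(sample)
print(counts)
for name, reason in failures:
    print(f"{name}: {reason}")
```

Objects that appear in the failures list remain on the target. Because a verification can have only one associated repair attempt, retrying them requires a new verification first.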

Repair inconsistencies with the UI

Repair inconsistencies

  1. Navigate to a metadata migration that has a completed verification report indicating inconsistencies.
  2. Click the Repair button.
  3. Review the dialogue box, confirm the number of objects scheduled for repair, and then select Repair to proceed.

Cancel a repair

To cancel a repair, whether it is currently running or waiting in the queue, select Cancel Repair after the repair process has begun.

View a repair summary report

Once you select Repair, the Repair panel displays the progress and details of the ongoing repair job. You can monitor this panel while the repair is in progress, as well as after it is complete or canceled.

When the repair begins, the following information is available:

  • Repair status
  • Repair progress
  • Repair start and complete times
  • The number of inconsistencies:
    • Attempted to be repaired
    • Successfully repaired
    • Failed to be repaired

Verifications Report Directory

Completed repair reports for metadata can be found in /opt/wandisco/hivemigrator/verifications/.

caution

When a metadata migration is Reset, older repair reports in the report directory are not removed.

Download a full repair report

Once the repair is finished, you can examine and expand the reports for the completed repairs in the Last x Verification Repairs section. You have the option to download either individual reports or select Download all files to get a single tar archive. The archive provides all necessary files to view and analyze the results of the metadata verification repair, specifically including:

  • repair-summary.json
  • repair-full-content.json

tip

Reports are downloaded as .gz files. Use a tool like gunzip or any compatible decompression utility to extract the file before viewing.

Repair inconsistencies with the CLI

Use the following commands to manage repairs.

hive migration verification repair

Start a new repair for a migration.

Example

Trigger a new repair against a completed verification
hive migration verification repair --verification-id migration1-1760957081968

hive migration verification repair cancel

Cancel an in-progress repair job.

Example

Stop a repair
hive migration verification repair cancel --verification-id migration1-1760957081968