Target Match
Target Match removes files from target file systems if they don't exist at source. Read and understand the Target Match option fully to determine if it is applicable to your use case.
Run a verification report before enabling Target Match to determine extraneous files at target and if removal is required.
Use the Target Match data migration configuration option to remove files on the target file system that don't exist at the source.
With Target Match enabled, Data Migrator scans both file systems and migrates files from source to target. During scanning, it also identifies files that exist on the target but not at the source and removes them from the target.
With Target Match disabled on a non-live migration, Data Migrator scans only the source file system for files to migrate. Because the migration is in a non-live state, it doesn't replicate live events, so changes made at the source (for example, deletions) are not applied to the target. As a result, the target file system can retain extra files that no longer exist on the source.
Live migrations: Scanning occurs when the migration is first started, when the migration is reset, and when a path is added for rescan. Scanning does not take place when a migration is stopped and resumed.
One-time migration: Scanning takes place at the start of the migration.
Recurring migration: Scanning takes place at the start of the migration, and repeatedly when the recurring interval is set.
Limitations
With Target Match enabled, Data Migrator can delete extraneous files on the target only while scanning. Once scanning has finished on a region of the file system, that region is not re-evaluated until the next scan. Any extraneous files manually created on the target after scanning are therefore not identified or removed until the next scan occurs.
Exclusions apply only to the migration of files. They don't limit or trigger removals from the target when using Target Match.
- Excluding a file from a migration that exists on source and target won't result in the file being removed from target using Target Match.
- Excluding a file from a migration that exists on target but not on source will still result in the file being removed from target using Target Match.
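The two exclusion cases above can be illustrated with a minimal sketch. It is a conceptual model only, not Data Migrator's implementation: the function name `plan_with_exclusions` is hypothetical, and glob patterns are used as a stand-in for exclusions.

```python
from fnmatch import fnmatchcase


def plan_with_exclusions(source_paths, target_paths, exclusion_patterns):
    """Conceptual model: exclusions filter migration, not removal.

    Assumption: glob patterns stand in for the product's exclusions.
    """
    def excluded(path):
        return any(fnmatchcase(path, pattern) for pattern in exclusion_patterns)

    source = set(source_paths)
    target = set(target_paths)

    # Excluded files are skipped when migrating from source.
    migrated = sorted(p for p in source if not excluded(p))
    # Removal depends only on whether the path exists at source;
    # exclusions are not consulted here.
    removed = sorted(target - source)
    return migrated, removed


source = ["/data/keep.log", "/data/report.csv"]
target = ["/data/keep.log", "/data/old.log"]
migrated, removed = plan_with_exclusions(source, target, ["*.log"])
print(migrated)  # ['/data/report.csv']
print(removed)   # ['/data/old.log']
```

In this model, `/data/keep.log` is excluded and exists on both sides, so it is neither migrated nor removed. `/data/old.log` also matches the exclusion, but it exists only on the target, so it is still removed.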
Internal path mappings and Target Match are incompatible and can't be used together.
Enable and disable Target Match on a migration
Target Match disabled
Scans the source file system to identify files to migrate.
Data Migrator will not be aware of or take any action on files that exist on target but not at source.
Target Match enabled
Scans both source and target file systems.
Identifies and migrates files from source.
Identifies and removes any files that exist on target but don't exist at source.
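The difference between the two modes can be sketched as a set comparison. This is an illustrative model under simplifying assumptions (paths as plain strings, no live events, no overwrite rules), and `plan_scan` is a hypothetical name rather than part of Data Migrator.

```python
def plan_scan(source_paths, target_paths, target_match=False):
    """Conceptual model of a scan with Target Match disabled or enabled.

    Returns (considered_for_migration, removed_from_target).
    """
    source = set(source_paths)
    target = set(target_paths)

    # In both modes, files found at source are considered for migration.
    considered = sorted(source)

    # Only with Target Match enabled is the target scanned as well, and
    # paths that exist on target but not at source are removed.
    removed = sorted(target - source) if target_match else []
    return considered, removed


source = ["/data/a", "/data/b"]
target = ["/data/b", "/data/stale"]

print(plan_scan(source, target, target_match=False))
# (['/data/a', '/data/b'], [])
print(plan_scan(source, target, target_match=True))
# (['/data/a', '/data/b'], ['/data/stale'])
```

With Target Match disabled, `/data/stale` is never looked at. With it enabled, the same path is identified as target-only and removed.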
Enable Target Match when you create a migration, or when you reset a stopped migration.
UI
Data migrations created in the UI default to Target Match disabled, unless you're adding a live migration to an ADLS Gen2 or an IBM Spectrum Scale (GPFS) live source. In that case, Target Match is enabled by default for all migrations created with this source type.
Create migration with Target Match in the UI
- Create a migration with the UI.
- Under Target Match, select Enable Target Match to enable Target Match on this migration.
Target Match is enabled by default for live ADLS Gen2 and live IBM Spectrum Scale (GPFS) sources.
Enable Target Match on single migration reset
Enable or disable Target Match on an individual migration when stopped and reset. See Reset Migration.
Recurring migrations can't be reset. To enable Target Match on an existing recurring migration, stop and delete the migration, then create it again with Target Match enabled.
Enable Target Match on bulk migration reset
Enable Target Match on multiple migrations when stopped and reset with the Reset with Target Match bulk action.
Migrations must be in a Stopped or Failed state to appear in the list of migrations available to reset.
Live ADLS Gen2 and live IBM Spectrum Scale (GPFS) sources have Target Match enabled by default, so the Reset with Target Match bulk action isn't available for these sources.
- On the Dashboard page, select the relevant instance from the Instances panel.
- Select Data Migrations from the Migrations menu on the left side.
- Under Bulk Action, select Reset with Target Match.
- Select all migrations you want to update.
- Select Reset.
You can't disable Target Match with a bulk action. Disable Target Match on an individual migration when stopped and reset. See Reset Migration.
If you have Target Match enabled on a live migration, a rescan will also trigger the Target Match action during scanning.
CLI
Create migration with Target Match in the CLI
Use --target-match with the migration add command to enable Target Match when creating a new migration.
Target Match is disabled when --target-match is not present, with the exception of live migrations on live ADLS Gen2 and live IBM Spectrum Scale (GPFS) sources, which use Target Match regardless of the presence of the --target-match option.
Example, create migration with Target Match
The following example shows a recurring migration with Target Match enabled.
migration add --name example1 --path /data/4 --source mysource --target mytarget --scan-only --recurring-migration --recurring-period 10m --target-match
Target Match will remove files on target. The CLI will prompt you to confirm. You won't get a confirmation prompt when adding a live migration on live ADLS Gen2 or live IBM Spectrum Scale (GPFS) sources.
When adding a live migration to a live ADLS Gen2 or a live IBM Spectrum Scale (GPFS) source, Target Match is enabled by default regardless of the presence of the --target-match option.
Enable Target Match on reset in the CLI
Use the migration reset command with the --target-match option and a value of either 'ENABLE' or 'DISABLE' on a stopped migration to enable or disable Target Match.
Example, enable Target Match on reset.
migration reset --migration-id mig3 --target-match ENABLE
Example, disable Target Match on reset.
migration reset --migration-id mig3 --target-match DISABLE
Check if enabled
UI
Confirm if Target Match is enabled on a migration with the UI.
- From the Dashboard, select the migration you want to check.
- Select Settings.
- Under Migration Settings, check the value of Target Match as either 'Enabled' or 'Disabled'.
CLI
Use the migration show CLI command with the --detailed option to show a migration configuration.
migration show --name MyMigration1 --detailed
If Target Match has been enabled, migrationScanType will have a value of "TWO_WAY_SCAN":
..
"target": "hdfstarget",
"state": "COMPLETED",
"resumable": false,
"abortReason": null,
"migrationStrategy": "NO_EVENT_STREAM",
"migrationScanType": "TWO_WAY_SCAN",
"exclusions": [
..
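If you capture the detailed output in a script, the check can be automated. The sketch below assumes the output is available as a single JSON object with migrationScanType at the top level. The excerpt above shows only part of the output, so that shape, the sample document, and the function name `target_match_enabled` are assumptions for illustration.

```python
import json


def target_match_enabled(detailed_json):
    """Return True if the configuration reports a two-way scan.

    Assumption: `detailed_json` is one JSON object with a top-level
    "migrationScanType" field, as suggested by the excerpt above.
    """
    config = json.loads(detailed_json)
    return config.get("migrationScanType") == "TWO_WAY_SCAN"


# Hypothetical, shortened sample built from the fields shown above.
sample = """{
  "target": "hdfstarget",
  "state": "COMPLETED",
  "migrationStrategy": "NO_EVENT_STREAM",
  "migrationScanType": "TWO_WAY_SCAN"
}"""

print(target_match_enabled(sample))  # True
```

Any other value, or a missing field, is treated as Target Match not being enabled.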
Activity monitoring
View the number of removal actions by checking the status of a migration in the UI or with the CLI. Details of files and directories identified and removed using Target Match are contained in the migration-audit log for the specific migration.
Use a verification report before enabling Target Match to determine extraneous files at target and if removal is required.
The number of files shown in 'Total paths removed by Target Match' in the UI and the value of filesRemovedTargetMatchScan from the migration stats CLI command show the number of files removed from the base migration path. They are not a total that includes files removed within subfolders. See the following Knowledge base article to learn more.
Number of files and directories removed
UI
To view the number of files and directories removed while using Target Match for a migration in the UI:
- From the Dashboard, select an instance under Instances.
- Under Migrations, select Data Migrations.
- Under Data Migrations, select the migration you want to check.
- View the number of files and directories removed in the Total paths removed by Target Match field.
Find more information on the migration status and summary. Learn more.
CLI
Use the migration stats CLI command to view the number of files and directories removed while using Target Match for a migration.
The value of filesRemovedTargetMatchScan shows the number of files removed from the base migration path.
The value of dirsRemovedTargetMatchScan shows the number of directories removed.
migration stats --name MyMigration1
Logging
Extraneous files and directories identified for removal on the target by Target Match are logged in the migration-audit log for the specific migration with targetOnly=true.
The logs show the removal actions taken. When an action is taken to remove an extraneous directory, the log will reflect the removal of the directory but not the individual files contained in that directory.
..
2024-02-21 14:42:23.185: Path /STATIC/extra_dir_at_target returned from Iterator [sourceOnly=false, targetOnly=true]
2024-02-21 14:42:23.185: Path /STATIC/extra_file_at_target returned from Iterator [sourceOnly=false, targetOnly=true]
2024-02-21 14:42:23.185: Path /STATIC/static_DIR1 returned from Iterator [sourceOnly=false, targetOnly=false]
..
Files on the replication path and directories removed by Target Match are logged in the migration-audit log for the specific migration.
..
2024-02-21 14:42:24.735: Deleting Dangling Path [/STATIC/extra_dir_at_target] on target.
2024-02-21 14:42:24.735: Deleting Dangling Path [/STATIC/extra_file_at_target] on target.
..
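To review removals in bulk, the audit log can be summarized with a short script. The sketch below relies only on the two line formats shown in the excerpts above. Other line formats are ignored, and the function name `summarize_audit_lines` is hypothetical.

```python
import re

# Patterns derived from the sample log lines above (an assumption:
# real logs may contain other formats, which this sketch skips).
ITERATOR_RE = re.compile(
    r"^\S+ \S+: Path (?P<path>.+) returned from Iterator "
    r"\[sourceOnly=(?:true|false), targetOnly=(?P<target_only>true|false)\]$"
)
DELETE_RE = re.compile(
    r"^\S+ \S+: Deleting Dangling Path \[(?P<path>.+)\] on target\.$"
)


def summarize_audit_lines(lines):
    """Return (identified, deleted) path lists from audit log lines."""
    identified, deleted = [], []
    for raw in lines:
        line = raw.strip()
        match = ITERATOR_RE.match(line)
        if match:
            # Keep only paths flagged as existing on the target alone.
            if match.group("target_only") == "true":
                identified.append(match.group("path"))
            continue
        match = DELETE_RE.match(line)
        if match:
            deleted.append(match.group("path"))
    return identified, deleted


sample = [
    "2024-02-21 14:42:23.185: Path /STATIC/extra_dir_at_target returned from Iterator [sourceOnly=false, targetOnly=true]",
    "2024-02-21 14:42:23.185: Path /STATIC/static_DIR1 returned from Iterator [sourceOnly=false, targetOnly=false]",
    "2024-02-21 14:42:24.735: Deleting Dangling Path [/STATIC/extra_dir_at_target] on target.",
]
identified, deleted = summarize_audit_lines(sample)
print(identified)  # ['/STATIC/extra_dir_at_target']
print(deleted)     # ['/STATIC/extra_dir_at_target']
```

Because a removed directory is logged without its contents, the counts from a script like this reflect logged paths, not every file that was inside a removed directory.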
Considerations
Disaster recovery scenarios
Target Match identifies files to remove from a target. The source and target file system selection becomes a more critical component of any migration when using Target Match. For instance, if using a migration to recover a primary file system from a target, any new, additional, or extra files on the primary file system may actually be required. In this scenario, a migration with Target Match would not be applicable.
Hive compaction
Target Match is not recommended if Hive compaction is enabled on the target file system.
Contact Support if you have any questions or concerns around the use of Target Match with your migrations.