Create a migration
Create new migrations with Data Migrator using either the UI or CLI.
Migrations transfer existing data from the defined source to a target. While a migration is in progress, Data Migrator also migrates any changes made to the source data, so the target stays up to date with those changes without interrupting the migration.
You will typically create multiple migrations so that you can select specific content from your source filesystem by path. You can also migrate to multiple independent filesystems at the same time by defining multiple migration resources.
Do not remove or change the target filesystem path after the migration has been created, and do not write to target filesystem paths when a migration is underway.
Doing so could interfere with Data Migrator functionality and lead to unpredictable behavior.
Use different filesystem paths when writing to the target filesystem directly (and not through Data Migrator).
Create a new migration with the UI
From the Dashboard, select your Data Migrator instance under Instances.
Under Data Migrations, select Create migration.
Enter a name for the migration.
Note: Your migration name can't contain the following characters: /, \, %, [, ], ;
Choose a Source and Target from your filesystems.
Choose the path on your source filesystem that you want to migrate.
Use the folder browser and select the path name you want to migrate. Select the grey folder next to a path name to go inside of it and view its subdirectories.
Alternatively, enter the path manually.
If you're transferring data from a local filesystem or Network-Attached Storage (NAS), these sources contain a mount point that defines the root of the access that Data Migrator has on the filesystem.
Note: Azure Data Lake Storage (ADLS) Gen2 has a filesystem restriction of 60 segments. If you're migrating to ADLS Gen2 storage, your path must have fewer than 60 segments.
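The naming and path restrictions in the notes above can be checked before you create a migration. The following is a minimal sketch and not part of Data Migrator: the function names are hypothetical, and it assumes that a segment is each non-empty component between / separators, which you should confirm for your ADLS Gen2 target.

```python
# Hypothetical pre-flight checks; these helpers are not part of Data Migrator.

# Characters the note above lists as not allowed in a migration name.
FORBIDDEN_NAME_CHARS = set("/\\%[];")

# Assumption: "fewer than 60 segments" means at most 59 non-empty components.
ADLS_GEN2_SEGMENT_LIMIT = 60


def invalid_name_chars(name: str) -> list[str]:
    """Return the forbidden characters found in a migration name, sorted."""
    return sorted(set(name) & FORBIDDEN_NAME_CHARS)


def count_segments(path: str) -> int:
    """Count the non-empty components of a slash-separated path."""
    return len([part for part in path.split("/") if part])


def fits_adls_gen2(path: str) -> bool:
    """Return True if the path has fewer segments than the ADLS Gen2 limit."""
    return count_segments(path) < ADLS_GEN2_SEGMENT_LIMIT
```

An empty list from invalid_name_chars means the name passes the character check. Both checks are only approximations of what the product enforces.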
Migration settings
Path mappings
See path mappings that match your migration paths. If a migration path matches an existing path mapping, the path mapping automatically applies to the migration.
If you want to add path mappings to your migration but haven't created any, see Create path mappings for more information.
Assign exclusions to a new migration
Exclude specific file sizes or file names from the migration. If you want to exclude file sizes or names from your migration but haven't defined any exclusion templates yet, see Configure exclusions to learn how.
- On the new migration page, select Add new exclusion.
- Select the appropriate exclusion template from the dropdown.
The exclusion appears in the list, and can be removed before the migration is started.
Target Match
- Under Target Match, select Enable Target Match to enable Target Match on this migration. See Target Match for more information.
Target Match is enabled by default for ADLS Gen2 sources.
Migration type
Select the type of migration you want to create:
- Live migration: Changes made to the source filesystem will be migrated in real time using the notification system defined for this storage.
- Recurring migration: Existing data on the source will be moved to the target. The migration scan will be repeated to discover new changes.
- One-time migration: Existing data on the source will be moved to the target, after which the migration will be complete. When complete, to detect any changes on the source, a rescan can be performed.
Skip or overwrite files
If you've already migrated some data from the same source to the same target, you can choose whether to overwrite all the content (Overwrite) or only migrate new content that isn't already there (Skip if Size Match).
Select an action policy for the migration.
- Overwrite - Everything is replaced, even if the file size is identical.
- Skip if Size Match - If the file size is identical on the source and target, the file is skipped. If it’s a different size, the whole file is replaced.
The Overwrite policy isn't available when recurring migration is selected.
For ADLS and IBM Spectrum Scale sources, the Skip if Size Match option provides more efficient handling of rename operations, preventing redundant retransfers.
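The two action policies reduce to a simple per-file decision. The sketch below is illustrative only and is not Data Migrator's implementation: the ActionPolicy enum and should_transfer function are hypothetical names, and it assumes that a file missing from the target is always transferred.

```python
from enum import Enum
from typing import Optional


class ActionPolicy(Enum):
    OVERWRITE = "Overwrite"
    SKIP_IF_SIZE_MATCH = "Skip if Size Match"


def should_transfer(policy: ActionPolicy,
                    source_size: int,
                    target_size: Optional[int]) -> bool:
    """Decide whether a file is transferred under the given policy.

    target_size is None when the file does not exist on the target.
    """
    if target_size is None:
        # Assumption: a file absent from the target is always migrated.
        return True
    if policy is ActionPolicy.OVERWRITE:
        # Everything is replaced, even if the file size is identical.
        return True
    # Skip if Size Match: transfer the whole file only when sizes differ.
    return source_size != target_size
```

Note that the size comparison says nothing about file content: under Skip if Size Match, a file changed on the source without a change in size would be skipped in this model.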
Migration priority
Assign a priority to your migration based on how time-critical the data transfer is.
Under Migration Priority, select High, Normal, or Low.
Higher-priority migrations are processed first. The default priority is Normal for all migrations.
See Prioritize migrations for more information on managing migration priority.
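As a rough model of the ordering described above, the following sketch sorts migrations so that higher priorities come first and treats a missing priority as Normal. The function and data layout are hypothetical, and keeping the original order among migrations of equal priority is an assumption, not documented behavior.

```python
# Lower rank is processed earlier. The priority names come from the UI options.
PRIORITY_RANK = {"High": 0, "Normal": 1, "Low": 2}
DEFAULT_PRIORITY = "Normal"


def processing_order(migrations):
    """Return migration names ordered from highest to lowest priority.

    migrations is a list of (name, priority) pairs; priority may be None,
    in which case the default priority applies.
    """
    ranked = sorted(
        migrations,
        key=lambda item: PRIORITY_RANK[item[1] or DEFAULT_PRIORITY],
    )
    # sorted() is stable, so equal priorities keep their input order
    # (an assumption about tie-breaking made for this sketch).
    return [name for name, _ in ranked]
```

For example, processing_order([("logs", "Low"), ("sales", "High")]) places sales ahead of logs.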
Grouping
If you use Management Groups:
- If you are an admin user, select a management group or leave the migration unassigned.
- If you are a non-admin user, select your management group.
If you don't have any Management Groups, leave the migration unassigned.
See Migration Management for more information on migration groups.
Migration options
Select Auto-start migration to start the migration as soon as you save it. Alternatively, you can start it manually when viewing it later.
Manage a migration with the UI
You can Stop, Resume, or Reset a migration on the migration status page.
Bulk actions
You can apply Add exclusions, Reset, Resume, Start, and Stop actions to multiple migrations at once with the UI. See Bulk actions for more details.
Create a new migration with the CLI
Use the migration command to migrate data from your source filesystem to a defined target. Migrations transfer existing data, as well as any subsequent changes made to the source data (within the migration's scope), while Data Migrator remains in operation.
You will typically create multiple migrations so that you can select specific content from your source filesystem by path/directory. It is also possible to migrate to multiple independent filesystems at the same time by defining multiple migration resources.
Follow the command links to learn how to set the parameters and see examples.
Create a new migration:
Apply the --auto-start parameter if you would like the migration to start right away. Apply the --priority parameter to assign a priority to the migration.
Assign exclusions to the migration:
If you don't have auto-start enabled, manually start the migration:
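As a sketch of how the --auto-start and --priority parameters combine on one command line, the helper below assembles a command string. Only --auto-start and --priority are taken from this page. The add subcommand, the --name, --path, and --target parameters, and the accepted priority values are assumptions to check against the command reference, and the helper itself is hypothetical.

```python
import shlex
from typing import Optional


def migration_add_command(name: str, path: str, target: str,
                          auto_start: bool = False,
                          priority: Optional[str] = None) -> str:
    """Build a command string for creating a migration.

    Assumption: the subcommand is "migration add" and it accepts
    --name, --path, and --target; verify against the command reference.
    """
    args = ["migration", "add",
            "--name", name, "--path", path, "--target", target]
    if auto_start:
        # Start the migration as soon as it is created.
        args.append("--auto-start")
    if priority is not None:
        # Assumption: the value is passed through exactly as given.
        args += ["--priority", priority]
    # shlex.join applies POSIX shell quoting to arguments that need it.
    return shlex.join(args)
```

Leaving auto_start as False corresponds to the case where you start the migration manually afterwards.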
Create a one-time migration
Create a one-time migration if you do not want Data Migrator to scan for changes to your data during a migration. These migrations do not require you to have write access to the source filesystem or to operate the migration as the hdfs user.
Create a recurring migration
Create a recurring migration if you want the migration scan to repeat and discover new changes after existing data on the source has been moved to the target.
Set a running migration limit
Set a running migration limit to control how many data migrations can run simultaneously, excluding those in a live or recurring state.
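The effect of the limit can be modeled as a count that ignores live and recurring migrations. This is an illustrative sketch only: the state labels, the Migration class, and both functions are hypothetical, and what Data Migrator does with migrations that exceed the limit is not covered here.

```python
from dataclasses import dataclass

# Hypothetical state labels used only in this sketch.
EXEMPT_STATES = {"LIVE", "RECURRING"}
ACTIVE_STATES = {"RUNNING", "LIVE", "RECURRING"}


@dataclass
class Migration:
    name: str
    state: str  # for example "RUNNING", "LIVE", "RECURRING", or "STOPPED"


def counted_running(migrations: list[Migration]) -> int:
    """Count active migrations that count toward the running limit."""
    return sum(
        1 for m in migrations
        if m.state in ACTIVE_STATES and m.state not in EXEMPT_STATES
    )


def can_start_another(migrations: list[Migration], limit: int) -> bool:
    """Return True if starting one more migration stays within the limit."""
    return counted_running(migrations) < limit
```

In this model, a live or recurring migration never consumes one of the available slots, matching the exclusion described above.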