Version: 4.0 (latest)

Configure source filesystems

Add source filesystems to your Data Migrator product instance to configure where to migrate data from.

note

Hadoop Distributed File System (HDFS) - Add only one source filesystem for each product.
S3 sources (IBM Cloud Object Storage, Amazon S3) - Add one or more source filesystems.

See Supported sources and targets for a list of supported sources and their supported targets and features.

Configure source filesystems with the UI

Add a source filesystem

To add a source filesystem:

From the Dashboard, select an instance under Instances.
Under the Filesystems & Agents menu, select Filesystems.
Select Add source filesystem.
Under Source Filesystem Configuration, select your source filesystem type.

Autodiscovery.

If you have HDFS in your environment, Data Migrator automatically detects it as your source filesystem. However, if Kerberos is enabled, or if your Hadoop configuration doesn't contain the configuration file information required for Data Migrator to connect to Hadoop, configure an HDFS source with additional Kerberos configuration.

If you want to configure a new source manually, delete any existing source first, and then manually add a new source.

If you deleted the HDFS source that Data Migrator detected automatically and want to redetect it, go to the CLI and run the command `filesystem auto-discover-source hdfs`.

Filesystem type

See the following links to add your specific source filesystem with the UI.

Add an Amazon S3 source.

Add an Azure Data Lake Storage (ADLS) Gen2 source.

Add a Ceph Storage source.

Add a Google Cloud Storage source.

Add an HDFS source.

Add an IBM Cloud Object Storage source.

Add a Local Filesystem source.

Add a Mounted Network-Attached Storage (NAS) source.

Add an Ozone source.

Add an S3 source.

For information about configuring filesystem health check notifications and email alerts, see Configure email notifications with the UI.

Configure source filesystems with the CLI

Data Migrator migrates data from a single source filesystem. Data Migrator automatically detects the Hadoop Distributed File System (HDFS) it's installed on and configures it as the source filesystem. If it doesn't detect the HDFS source automatically, you can validate the source. You can override auto-discovery of any HDFS source by manually adding a source filesystem.

Use the following CLI commands to add source filesystems:

Command	Action	CLI Examples
`filesystem add s3a`	Add an S3 filesystem resource. Use this for Amazon S3, Oracle, and IBM Cloud Object Storage. To specify the required filesystem type, use the `--s3type` parameter. See s3a optional parameters.	Amazon S3, IBM COS, S3a
`filesystem add adls2 oauth`	Add an ADLS Gen 2 filesystem resource using a service principal and OAuth credentials	ADLS
`filesystem add adls2 sharedKey`	Add an ADLS Gen 2 filesystem resource using access key credentials	ADLS
`filesystem add gcs`	Add a Google Cloud Storage source filesystem	GCS
`filesystem add hdfs`	Add an HDFS resource	HDFS
`filesystem add local`	Add a local or mounted NAS filesystem resource.	Local, NAS

Validate your source filesystem

Verify that the correct source filesystem is registered, or delete the existing one (you define a new source in the step Add a source filesystem).

If Kerberos is enabled or your Hadoop configuration does not contain the information needed to connect to the Hadoop filesystem, use the `filesystem auto-discover-source hdfs` command to enter your Kerberos credentials and auto-discover your source HDFS configuration.

note

If Kerberos is disabled, and Hadoop configuration is on the host, Data Migrator will detect the source filesystem automatically on startup.
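The two conditions in the note can be checked by hand before relying on startup detection. The following is a minimal sketch and not part of Data Migrator: `check_hdfs_autodetect` is a hypothetical helper, `/etc/hadoop/conf` is an assumed default location, and treating `core-site.xml` plus `hdfs-site.xml` as "the Hadoop configuration" is also an assumption. It reads the standard Hadoop property `hadoop.security.authentication` to guess whether Kerberos is enabled.

```shell
# Hypothetical helper: report whether the conditions for automatic
# HDFS source detection appear to hold on this host.
# Assumes the Hadoop client config sits in one directory
# (default /etc/hadoop/conf, which may differ on your cluster).
check_hdfs_autodetect() {
  conf_dir="${1:-/etc/hadoop/conf}"

  # Both standard Hadoop client config files must be present.
  for f in core-site.xml hdfs-site.xml; do
    if [ ! -f "$conf_dir/$f" ]; then
      echo "missing-config"
      return 1
    fi
  done

  # hadoop.security.authentication set to kerberos means Kerberos is enabled.
  # This is a simple text match, not a full XML parse.
  if grep -A 2 'hadoop.security.authentication' "$conf_dir/core-site.xml" \
      | grep -q '<value>kerberos</value>'; then
    echo "kerberos-enabled"
    return 2
  fi

  echo "auto-detect-likely"
  return 0
}
```

Calling `check_hdfs_autodetect /etc/hadoop/conf` prints one of three labels. Anything other than `auto-detect-likely` suggests running `filesystem auto-discover-source hdfs` as described above.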

Manage your source filesystem

Manage the source filesystem with the following commands:

Command	Action
`source clear`	Delete all sources
`source delete`	Delete one source
`source show`	View the source filesystem configuration
`filesystem auto-discover-source hdfs`	Enter your Kerberos credentials to access your source HDFS configuration
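
As a rough sketch, the commands in the table might be combined as follows at the Data Migrator CLI prompt (not the system shell) to replace an auto-detected HDFS source. The ordering is an assumption about a typical workflow, and the `#` lines are annotations for the reader rather than CLI input. Because `source clear` removes every registered source, check the `source show` output first.

```text
# 1. Inspect the source that is currently registered.
source show

# 2. Remove all registered sources (destructive).
source clear

# 3. Re-detect the HDFS source, supplying Kerberos credentials if required.
filesystem auto-discover-source hdfs

# 4. Confirm the newly registered source.
source show
```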

note

To update existing filesystems, first stop all migrations associated with them. After you save updates to your configuration, restart the Data Migrator service for the updates to take effect. On most supported Linux distributions, run the command `service livedata-migrator restart`.
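
The restart step could be wrapped in a small shell function so that it can be previewed before it runs. This is only a sketch: `restart_data_migrator` and the `DRY_RUN` variable are hypothetical names, and the use of `sudo` assumes the service command needs root privileges on your host.

```shell
# Hypothetical wrapper around the restart command from the note above.
# Set DRY_RUN=1 to print the command instead of running it.
restart_data_migrator() {
  cmd="service livedata-migrator restart"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$cmd"
  else
    # Assumes root is required; drop sudo if you are already root.
    sudo $cmd
  fi
}
```

Run it with `DRY_RUN=1` set first to see the exact command, then run it without `DRY_RUN` once all associated migrations are stopped.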
