Version: 3.0 (latest)

Configure an ADLS Gen2 target

You can migrate data to an Azure Data Lake Storage (ADLS) Gen2 filesystem by configuring it as a target filesystem for Data Migrator.

You can authenticate to an ADLS Gen2 filesystem with OAuth 2.0 or a shared key. Select an option below to get started:

note

For maximum ingress rates, capacity, and request rates, as well as scalability and performance targets for Azure Storage, see the Microsoft Azure Storage documentation.

info

The default configuration for ADLS Gen2 storage allows a maximum file size of 400 GB. See the related knowledge base article for steps and guidance on increasing the maximum file size.

Configure an ADLS Gen2 target with OAuth 2.0

Prerequisites

You need the following:

  • A service principal with either the Storage Blob Data Owner role assigned on the ADLS Gen2 storage account, or an access control list (ACL) that grants RWX permissions on the parent of the migration path. If you have multiple migration paths, set the ACLs on the parent of each path or on one common parent. For more information, see the Microsoft documentation.
  • Your OAuth 2.0 credentials. See more information on credentials below.
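
The ACL placement rule above can be worked out mechanically. The following is a minimal Python sketch, assuming POSIX-style absolute migration paths; the function name `acl_targets` and the example paths are hypothetical and not part of Data Migrator. It only computes where RWX ACLs would be needed and does not set any ACLs.

```python
import posixpath


def acl_targets(migration_paths):
    """Return (parents, common_parent) for a list of absolute migration paths.

    parents:       the parent directory of each migration path (deduplicated)
    common_parent: a single directory that is an ancestor of every parent
    """
    normalized = [posixpath.normpath(p) for p in migration_paths]
    parents = sorted({posixpath.dirname(p) for p in normalized})
    common_parent = posixpath.commonpath(parents)
    return parents, common_parent


# Hypothetical migration paths, for illustration only.
parents, common_parent = acl_targets(
    ["/data/sales/2023", "/data/sales/2024", "/data/hr/records"]
)
print(parents)        # ['/data/hr', '/data/sales']
print(common_parent)  # /data
```

Either option satisfies the prerequisite: grant RWX on every directory in `parents`, or grant it once on `common_parent`.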

Configure an ADLS Gen2 target filesystem with OAuth 2.0 in the UI

  1. From the Dashboard, select an instance under Instances.

  2. In the Filesystems & Agents menu, select Filesystems.

  3. Select Add target filesystem.

  4. Enter the following details:

    • Filesystem Type - The type of filesystem target. Select Azure Data Lake Storage (ADLS) Gen2.
    • Display Name - Enter a name for your target filesystem.
    • Data Lake Storage Endpoint - The storage endpoint to connect to. You can override the default value (dfs.core.windows.net) by replacing it with a custom or private endpoint.
    • Authentication Type - Select Service Principal (OAuth2).
    • Account Name - The name of your ADLS Gen2 storage account.
    • Container Name - The name of the container in your storage account that you want to migrate data to.
    • Client ID - The client ID (also known as application ID) for your Azure service principal. If you have configured a Vault for secrets storage, use a reference to the value stored in your secrets store.
    • Secret - The client secret (also known as application secret) for the Azure service principal. If you have configured a Vault for secrets storage, use a reference to the value stored in your secrets store.
    • OAuth2 Endpoint - The client endpoint for the Azure service principal. Use the format https://login.microsoftonline.com/<tenant>/oauth2/v2.0/token where <tenant> is the directory ID for the Azure service principal.
    • Use Secure Protocol - When enabled, Data Migrator will use TLS to connect to the Azure Data Lake Storage. Enabled by default.
  5. Select Save. You can now use your ADLS Gen2 target in data migrations.
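
Before entering these values in the UI, it can help to see how the OAuth2 Endpoint, Client ID, and Secret fit together. The sketch below, using only the Python standard library, builds the token endpoint from a tenant ID and prepares a client-credentials token request against it. The helper names and placeholder values are hypothetical, the `https://storage.azure.com/.default` scope is the standard Azure Storage scope rather than something this page specifies, and this is not necessarily how Data Migrator itself performs authentication.

```python
import urllib.parse
import urllib.request


def oauth2_endpoint(tenant_id):
    """Build the OAuth2 Endpoint value in the format the UI expects."""
    return f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"


def build_token_request(tenant_id, client_id, client_secret):
    """Prepare (but do not send) a client-credentials token request."""
    body = urllib.parse.urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            # Assumption: the standard scope for Azure Storage access.
            "scope": "https://storage.azure.com/.default",
        }
    ).encode("ascii")
    return urllib.request.Request(
        oauth2_endpoint(tenant_id), data=body, method="POST"
    )


# Placeholder values for illustration; substitute your own credentials.
request = build_token_request("my-tenant-id", "my-client-id", "my-secret")
print(request.full_url)
```

Passing `request` to `urllib.request.urlopen` would send it. A JSON response containing an `access_token` field indicates that the tenant, client ID, and secret are consistent with each other; it does not confirm that the service principal has the required role or ACLs on the storage account.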

Metadata handling properties

Add the following Data Migrator application properties to /etc/wandisco/livedata-migrator/application.properties to control ACL, permission, and owner metadata operations for ADLS Gen2 targets. If a property is not specified, its default value is used. See Access control model in Azure Data Lake Storage Gen2 for more information on Azure access control.

Set these properties to true for filesystems that don't require transfer of source content ownership, ACL or permissions information, or where the authorization granted to the credentials you've used to access your storage does not allow these operations to be performed.

| Property | Default | Description |
|---|---|---|
| adls2.fs.metadata.acl.ignore | false | When set to true, Data Migrator will not attempt to perform any setAcls operation against an ADLS Gen2 target. |
| adls2.fs.metadata.perms.ignore | false | When set to true, Data Migrator will not attempt to perform any setPermission operation against an ADLS Gen2 target. This will also affect the health check for the target file system, which will not present an error if the principal under which Data Migrator operates is unable to perform setPermission operations, and will also allow a migration to start that would be prevented from doing so by an inability to perform setPermission operations. |
| adls2.fs.metadata.owner.ignore | false | When set to true, Data Migrator will not attempt to perform any setOwner operations against an ADLS Gen2 target. |

Add ADLS Gen2 metadata handling properties

To add ADLS Gen2 target metadata handling properties to Data Migrator:

  1. Open /etc/wandisco/livedata-migrator/application.properties.

  2. Add each property and its value on a new line:

    adls2.fs.metadata.acl.ignore=true
    adls2.fs.metadata.perms.ignore=true
    adls2.fs.metadata.owner.ignore=true
  3. Save the changes.

  4. Restart the Data Migrator service to apply the change. See System service commands - Data Migrator.
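
If you manage this file with a script, the edit in steps 1 to 3 can be made repeatable. The following is a minimal Python sketch, assuming the file uses plain `key=value` lines; the function name `set_properties` is hypothetical, and the demo writes to a temporary file rather than to /etc/wandisco/livedata-migrator/application.properties. It replaces existing entries in place and appends any that are missing, so running it twice does not create duplicates.

```python
import tempfile
from pathlib import Path

ADLS2_METADATA_PROPS = {
    "adls2.fs.metadata.acl.ignore": "true",
    "adls2.fs.metadata.perms.ignore": "true",
    "adls2.fs.metadata.owner.ignore": "true",
}


def set_properties(path, updates):
    """Set key=value pairs in a properties file, replacing existing keys."""
    file = Path(path)
    lines = file.read_text().splitlines() if file.exists() else []
    seen = set()
    result = []
    for line in lines:
        # Commented lines start with '#', so their "key" never matches.
        key = line.split("=", 1)[0].strip()
        if key in updates:
            result.append(f"{key}={updates[key]}")
            seen.add(key)
        else:
            result.append(line)
    # Append any properties that were not already present.
    result.extend(f"{k}={v}" for k, v in updates.items() if k not in seen)
    file.write_text("\n".join(result) + "\n")


# Demo against a temporary copy; point `path` at the real file to apply.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "application.properties"
    demo.write_text("adls2.fs.metadata.acl.ignore=false\n")
    set_properties(demo, ADLS2_METADATA_PROPS)
    print(demo.read_text())
```

The script only edits the file. The Data Migrator service still needs to be restarted, as in step 4, for the new values to take effect.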

Next steps

If you haven't already, configure a source filesystem from which to migrate data. Then, you can create a migration to migrate data to your new ADLS Gen2 target.