Version: 2.6

Configure IBM Cloud Object Storage

note

If you use IBM Cloud Object Storage as a source filesystem with Data Migrator, and have feedback to share, contact us.

Configure IBM Cloud Object Storage as a source with the UI

To configure an IBM Cloud Object Storage bucket as a source filesystem, select IBM Cloud Object Storage in the Filesystem Type dropdown menu when configuring filesystems with the UI.

Enter the following details:

  • Filesystem Type - The type of filesystem source. Choose IBM Cloud Object Storage.
  • Display Name - A name for your IBM Cloud Object Storage filesystem.
  • Access Key - The access key for your authentication credentials, associated with the fixed authentication credentials provider org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider.
note

Although IBM Cloud Object Storage can use other providers (for example, InstanceProfileCredentialsProvider, DefaultAWSCredentialsProviderChain), they are available only in the cloud, not on-premises. Because on-premises is currently the expected type of source, these other providers have not been tested and are not currently selectable.

  • Secret Key - The secret key for your authentication credentials, used with the org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider credentials provider.
  • Bucket Name - The name of your IBM Cloud Object Storage bucket.
  • Topic - The name of the Kafka topic to which the notifications will be sent.
  • Endpoint - An endpoint for a Kafka broker, in a host/port format.
  • Bootstrap Servers - A comma-separated list of host and port pairs that are addresses for Kafka brokers on a "bootstrap" Kafka cluster that Kafka clients use to bootstrap themselves.
  • Port - The TCP port used for connection to the IBM Cloud Object Storage bucket. Default is 9092.
note

Migrations from IBM Cloud Object Storage use Amazon S3, along with its filesystem classes. The main difference between IBM Cloud Object Storage and Amazon S3 is in the messaging services: SQS queue for Amazon S3, Kafka for IBM Cloud Object Storage.
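
The Bootstrap Servers field takes a comma-separated list of host and port pairs, so a quick check before pasting the value into the UI can catch typos. The following is a minimal Python sketch and is not part of Data Migrator. The function name `parse_bootstrap_servers` is hypothetical, the addresses are placeholders, and the sketch assumes the pairs are written in host:port notation. It checks syntax only and does not contact the brokers.

```python
def parse_bootstrap_servers(value: str) -> list[tuple[str, int]]:
    """Split 'host1:port1,host2:port2' into a list of (host, port) tuples."""
    servers = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray or trailing commas
        # Split on the last colon so the port is always the final component.
        host, sep, port_text = entry.rpartition(":")
        if not sep or not host:
            raise ValueError(f"expected host:port, got {entry!r}")
        if not (port_text.isascii() and port_text.isdigit()):
            raise ValueError(f"port is not a number in {entry!r}")
        port = int(port_text)
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range in {entry!r}")
        servers.append((host, port))
    if not servers:
        raise ValueError("no bootstrap servers given")
    return servers
```

For example, `parse_bootstrap_servers("10.0.0.123:9092, 10.0.0.125:9092")` returns `[("10.0.0.123", 9092), ("10.0.0.125", 9092)]`, while an entry without a port raises `ValueError`.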

Configure IBM Cloud Object Storage as a source with the CLI

Creating an IBM Cloud Object Storage source through the CLI uses the same set of commands that are used for Amazon S3. The following examples clarify how the commands are used:

  • Add an IBM Cloud Object Storage source filesystem. Note that this does not work if SSL is used on the endpoint address.

    filesystem add s3a --source --file-system-id cos_s3_source2
    --bucket-name container2
    --credentials-provider org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider
    --access-key pkExampleAccessKeyiz
    --secret-key c3vq6vaNtExampleSecretKeyVuqJMIHuV9IF3n9
    --s3type ibmcos
    --bootstrap.servers=10.0.0.123:9092
    --topic newcos-events
    --endpoint http://10.0.0.124
  • Add path mapping.

    path mapping add --path-mapping-id testPath
    --description tt
    --source-path /
    --target targetHdfs2
    --target-path /repl_test1
    {
    "id": "testPath",
    "description": "tt",
    "sourceFileSystem": "cos_s3_source2",
    "sourcePath": "/",
    "targetFileSystem": "targetHdfs2",
    "targetPath": "/repl_test1"
    }
  • Add a file to the container.

    ./mc cp ~/Downloads/wq4.pptx cos/container2/
  • Remove a file from the container.

    ./mc rm cos/container2/wq4.pptx
  • List objects in container.

    ./mc ls cos/container2/
  • List objects in the container via the S3 API, using the AWS CLI.

    aws s3api list-objects --endpoint-url=http://10.0.0.201
    --bucket container2
  • Configure mc by adding a "cos" entry to its configuration file.

    nano ~/.mc/config.json

    Add the following entry to the file:
    "cos": {
    "url": "https://s3-cos.wandisco.com",
    "accessKey": "pkExampleAccessKeyiz",
    "secretKey": "c3vq6vaNtExampleSecretKeyVuqJMIHuV9IF3n9",
    "api": "S3v4",
    "path": "auto"
    }
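
If you create sources from a script, it can help to assemble the add-source command from named values and to catch an SSL endpoint early, since the example above notes that the command does not work with SSL on the endpoint address. The sketch below is illustrative only. `build_add_source_command` is a hypothetical helper, it uses only the flags shown in the example, it spells the endpoint flag `--endpoint`, and it assumes the CLI accepts the whole command on a single line.

```python
from urllib.parse import urlparse


def build_add_source_command(fs_id, bucket, access_key, secret_key,
                             bootstrap_servers, topic, endpoint):
    """Return a 'filesystem add s3a' command line for an IBM COS source."""
    # Adding the source is documented not to work when SSL is used on the
    # endpoint address, so reject https:// endpoints up front.
    if urlparse(endpoint).scheme == "https":
        raise ValueError("endpoint must not use SSL: " + endpoint)
    parts = [
        "filesystem add s3a --source",
        f"--file-system-id {fs_id}",
        f"--bucket-name {bucket}",
        "--credentials-provider "
        "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider",
        f"--access-key {access_key}",
        f"--secret-key {secret_key}",
        "--s3type ibmcos",
        f"--bootstrap.servers={bootstrap_servers}",
        f"--topic {topic}",
        f"--endpoint {endpoint}",  # assumed spelling of the endpoint flag
    ]
    return " ".join(parts)
```

The returned string can be pasted into the Data Migrator CLI. Keep in mind that it contains the secret key in plain text, so avoid writing it to logs.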

Configure notifications for migrating the events stream

Migrating data from IBM Cloud Object Storage requires that filesystem events are fed into a Kafka-based notification service. Whenever an object is written, overwritten or deleted using the S3 protocol, a notification is created and stored in a Kafka topic - a message category under which Kafka publishes the notifications stream.
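
Because the migration depends on this Kafka stream, it can be worth confirming that the broker addresses accept TCP connections before you configure notifications. The following is a minimal sketch using only the Python standard library. `unreachable_endpoints` is a hypothetical helper, and the check runs from whichever host executes the script, which may see a different network than IBM Cloud Object Storage or Data Migrator does. A successful connection shows only that the port is open, not that a healthy Kafka broker is listening on it.

```python
import socket


def unreachable_endpoints(endpoints, timeout=3.0):
    """Return the 'host:port' entries that cannot be reached over TCP."""
    failed = []
    for endpoint in endpoints:
        # Split on the last colon; a missing or non-numeric port is a failure.
        host, _, port_text = endpoint.rpartition(":")
        try:
            port = int(port_text)
            with socket.create_connection((host, port), timeout=timeout):
                pass  # connection opened; close it again immediately
        except (OSError, ValueError, OverflowError):
            failed.append(endpoint)
    return failed
```

Calling `unreachable_endpoints(["10.0.0.123:9092"])` returns an empty list when the connection succeeds, and a list containing the entry when it is refused, times out, or is malformed.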

Configure Kafka notifications

Enter the following information into the IBM Cloud Object Storage Manager web interface.

  1. Select the Administration tab.
  2. In the Notification Service section, select Configure.
  3. On the Notification Service Configuration page, select Add Configuration.
  4. In the General section enter the following:
    • Name: A name for the configuration, for example "IBM Cloud Object Storage Notifications".
    • Topic: The name of the Kafka topic to which the notifications will be sent.
    • Hostnames: A list of Kafka node endpoints in host:port format. Note that larger clusters may support multiple nodes.
    • Type: The type of configuration.
  5. [OPTIONAL] In the Authentication section, select Enable authentication and enter your Kafka username and password.

  6. [OPTIONAL] In the Encryption section, select Enable TLS for Apache Kafka network connections.

    • If the Kafka cluster is encrypted using a self-signed TLS certificate, paste the root CA key for your Kafka configuration in the Certificate PEM field.
  7. Select Save.

    • A message appears confirming that the notification was created successfully, and the configuration is listed in the Notification Service Configurations table.
  8. Select the name of the configuration (set in step 4) to assign vaults.

  9. In the Assignments section, select Change.

  10. In the Not Assigned tab, select vaults and select Assign to Configuration. Filter available vaults by selecting or typing a name into the Vault field.

    note

    Notification configurations can't be assigned to container vaults, mirrored vaults, vault proxies, or vaults that are migrating data. Once a vault is assigned to a configuration, it can't be used in a mirror, with a vault proxy, or for data migration.

    Only new operations that occur after a vault is assigned to the configuration will trigger notifications.

  11. Select Update.

    note

    For more information, see the Apache Kafka documentation.
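
Once vaults are assigned, one way to confirm that notifications reach the topic is to read a few messages back. The following sketch uses the third-party kafka-python package, which is an assumption on our part (any Kafka client would do). It also assumes that message values are UTF-8 encoded JSON and that authentication and TLS are disabled on the Kafka cluster. The function names, topic, and broker address are placeholders.

```python
import json


def decode_notification(raw: bytes) -> dict:
    """Decode one Kafka message value, assuming a UTF-8 JSON payload."""
    return json.loads(raw.decode("utf-8"))


def print_notifications(topic, bootstrap_servers, limit=10):
    """Print up to `limit` notifications currently retained in the topic."""
    # Imported lazily so decode_notification works without kafka-python.
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=bootstrap_servers,
        auto_offset_reset="earliest",   # start from the oldest retained message
        consumer_timeout_ms=10_000,     # stop iterating after 10 s of silence
        value_deserializer=decode_notification,
    )
    try:
        for count, record in enumerate(consumer, start=1):
            print(record.partition, record.offset, record.value)
            if count >= limit:
                break
    finally:
        consumer.close()


# Example call (placeholder values, needs a reachable broker):
# print_notifications("newcos-events", ["10.0.0.123:9092"])
```

Remember that only operations performed after the vault was assigned produce notifications, so write or delete a test object first if the topic appears empty.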