Skip to main content
Version: 2.4.3

Configure Snowflake as a target

note

Snowflake is currently available as a preview feature and under development. If you use Snowflake as a target metastore with Data Migrator, and have feedback to share, contact us. The feature is automatically enabled. See Preview features.

Configure Snowflake as a target metastore using either the UI or the CLI.

We support the following tables for migration:

info

Disable transactional tables with the following command:

tblproperties('transactional'='false')

We support public and private Snowflake schemas.

Prerequisites

Ensure you have the following before you start:

  • A Snowflake account identifier, warehouse, and stage. See the Snowflake documentation.
  • A cloud storage service already configured as your target filesystem. See Configure target filesystems.
  • If you choose the private key method of authentication, you'll need a private key file for Snowflake key pair authentication. The file must be stored on your Data Migrator cluster host. In this case, the Hive user requires access permissions to the private key file. See Configure system users.
  • Appropriate user and role privileges to ensure Data Migrator accesses and modifies only those Snowflake target schemas required for migrations. See below for examples.

Additional information.

Examples of how to set up the Hive user with limited permissions on Snowflake

This example shows how you can grant permissions to access and modify specific databases and schemas in Snowflake.

Example 1
USE ROLE ACCOUNTADMIN;
//Skip if warehouse already exists
CREATE WAREHOUSE "HVM_WAREHOUSE";
CREATE DATABASE "HVM_DB";
CREATE SCHEMA "HVM_DB"."HVM_SCHEMA";
CREATE ROLE "HVM_INTEGRATION";
GRANT USAGE ON WAREHOUSE "HVM_WAREHOUSE" TO ROLE "HVM_INTEGRATION";
GRANT USAGE ON DATABASE "HVM_DB" TO ROLE "HVM_INTEGRATION";
GRANT USAGE ON SCHEMA "HVM_DB"."HVM_SCHEMA" TO ROLE "HVM_INTEGRATION";
//For internal stage
GRANT READ ON STAGE "HVM_DB"."HVM_SCHEMA".HVM_STAGE TO ROLE "HVM_INTEGRATION";
//For external stage
GRANT USAGE ON STAGE in "HVM_DB"."HVM_SCHEMA" TO ROLE "HVM_INTEGRATION";
GRANT CREATE STAGE ON SCHEMA "HVM_DB"."HVM_SCHEMA" TO ROLE "HVM_INTEGRATION";
GRANT CREATE INTEGRATION ON ACCOUNT TO ROLE "HVM_INTEGRATION";
//There is no FUTURE grants support for databases, thus migration of databases requires ACCOUNTADMIN role
//If limited list of databases going to be migrated with HVM, databases can be created manually by admin or admin can
//grant ACCOUNTADMIN role to the HVM_USER for short period of time to migrate databases only (use particular HVM migration rule)
//Note: Changing the properties of a database, including comments, requires the OWNERSHIP privilege for the database.

//Invoke for all schemas
GRANT SELECT,INSERT,TRUNCATE ON FUTURE TABLES in schema <**DB.SCHEMA***> to role "HVM_INTEGRATION";
GRANT SELECT ON FUTURE VIEWS in schema <**DB.SCHEMA***> to role "HVM_INTEGRATION";

//Create new user with private key
CREATE USER "HVM_USER" RSA_PUBLIC_KEY="********" DEFAULT_ROLE = "HVM_INTEGRATION";
//Grant configured HVM_INTEGRATION role to the new user
GRANT ROLE "HVM_INTEGRATION" TO USER "HVM_USER";

This example shows how you can create a stored procedure and grant the LiveData Migrator user permissions. This method calls the stored procedure rather than using the CREATE DATABASE command.

Example 2
USE ROLE ACCOUNTADMIN ;
CREATE OR REPLACE PROCEDURE demo_db.public.pTestDatabaseCreation(DBNAME string)
RETURNS text
LANGUAGE javascript
COMMENT = 'Create a database without ACCOUNTADMIN privileges'
EXECUTE AS OWNER
as
$$
var stmt = snowflake.createStatement({
sqlText: "create database "+ DBNAME
})
var ret = stmt.execute();
return ret;
$$
;
GRANT USAGE ON PROCEDURE
demo_db.public.pTestDatabaseCreation(string) TO ROLE SYSADMIN;
USE ROLE SYSADMIN;
CALL demo_db.public.pTestDatabaseCreation('my_test_database');

Snowflake agent timeout recommendations

Occasionally, migrations may fail due to brief network disruption or heavy data processing. Reduce the chance of these failures by setting the following timeout properties in the hive agent add snowflake CLI command:

PropertyDescriptionRecommended
--networkTimeoutNumber of milliseconds to wait for a response when interacting with the Snowflake service before returning an error. 0 (zero) specifies that no network timeout is set. Default=600000600000
--queryTimeoutNumber of seconds to wait for a query to complete before returning an error. 0 (zero) specifies that the driver should wait indefinitely. Default=00
  • The recommended 10 minute (600000ms) networkTimeout ensures that brief disruption to the network won't cause migration failures but migrations will not hang indefinitely.

  • Leaving queryTimeout at the default 0 value ensures that long query times will never result in a timeout.

Configure Snowflake as a target with the UI

  1. From the Dashboard, select an instance under Instances.

  2. Under Filesystems & Agents, select Metastore Agents.

  3. Under Metastore Agents, select Connect to Metastore.

  4. Under Add a Metadata Agent, select the target to which you want to migrate your data from Filesystem.

  5. Under Metastore Type, select Snowflake.

  6. Enter a Display Name for the agent.

  7. Enter your User. The email address associated with the user you wish to represent LiveData Migrator for migrations to Snowflake, for example: user@domain.com.

  8. Select one of the following authentication types:

    • Basic Authentication
    note

    If you're using Basic Authentication, you’ll need to reenter the password when updating this agent.

    • Private Key If you select private key authentication, the following entries are required:

      • Path to Private Key File Enter the path to the file in which you store the private key on your Data Migrator cluster host. For example, /home/hive/snowflake_key.p8.

      • Private Key File Password Enter the password you used when you created the private key file.

        See Key Pair Authentication.

  9. Enter an Account Identifier. For example, <organization name>-<account name>. See Account Identifier and Organization Name.

  10. Enter a Warehouse Name. This is the name of the warehouse created in Snowflake. For example, <cirata_whs>. See Overview of Warehouses.

  11. Enter a Stage Name. This is the name of the stage created in Snowflake that links the filesystem attached to this agent to the Snowflake environment. For example, <cirata_stage>. See Create Stage.

  12. Enter a Stage Database. This is the Snowflake database that contains the stage. For example, <cirata_db>. See Database.

  13. Enter a Stage Schema Name. This is the schema that contains the stage. See Database and Schema.

  14. Enter a Schema Name. This is the schema into which this agent migrates tables. For example, database and table testdb.table1 once migrated display in Snowflake as "testdb".PUBLIC."table1", where PUBLIC is the default value for the schema name.

  15. The following entries are optional:

    • Enter a Role. This is the role with which this agent acts. For example, a default Snowflake role or a custom role for your organization. You can enter a custom role for the JDBC connection used by Hive Migrator.

    • Enter a Default Filesystem Override to override the default filesystem URI. We recommend this for complex use cases only.

  16. Select Save.

Next steps

  1. Create a metadata migration using the Snowflake target agent you just configured.

  2. Monitor the following from the Dashboard:

    • The progress of the migration.
    • The status of the migration.
    • The health of your agent connection. To view the connection status:
      • Select Check status from the ellipsis.
      • Select Settings
      • Select View agent.