Configure Snowflake as a target
Configure Snowflake as a target metastore using either the UI or the CLI.
We support the following tables for migration:
- Apache ORC
- Apache Parquet
- Partitioned and non-partitioned tables
- Non-transactional tables only

Disable the transactional property on a table with the following command:

```sql
ALTER TABLE <table_name> SET TBLPROPERTIES ('transactional'='false');
```
We support public and private Snowflake schemas.
Prerequisites
Ensure you have the following before you start:
- A Snowflake account identifier, warehouse, and stage. See the Snowflake documentation.
- A cloud storage service already configured as your target filesystem. See Configure target filesystems.
- If you choose the private key method of authentication, you'll need a private key file for Snowflake key pair authentication. The file must be stored on your Data Migrator cluster host. In this case, the Hive user requires access permissions to the private key file. See Configure system users.
- Appropriate user and role privileges to ensure Data Migrator accesses and modifies only those Snowflake target schemas required for migrations. See below for examples.
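If you don't yet have a key pair for the private key authentication method, Snowflake's standard key-pair generation flow with OpenSSL can produce one. The sketch below creates an encrypted PKCS#8 private key and its matching public key; the filenames and passphrase are placeholders assumed for illustration:

```shell
# Generate a 2048-bit RSA key and wrap it as an encrypted PKCS#8 private key.
# The passphrase here (MyKeyPassword) is a placeholder.
openssl genrsa 2048 | openssl pkcs8 -topk8 -v2 aes-256-cbc -inform PEM \
  -out snowflake_key.p8 -passout pass:MyKeyPassword

# Extract the public key to register in Snowflake
# (used as the RSA_PUBLIC_KEY value in CREATE USER).
openssl rsa -in snowflake_key.p8 -passin pass:MyKeyPassword \
  -pubout -out snowflake_key.pub
```

Store the private key file (`snowflake_key.p8` in this sketch) on your Data Migrator cluster host and ensure the Hive user can read it.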
Additional information:
- See the Snowflake known issue.
Examples of how to set up the Hive user with limited permissions on Snowflake
This example shows how you can grant permissions to access and modify specific databases and schemas in Snowflake.
```sql
USE ROLE ACCOUNTADMIN;

// Skip if the warehouse already exists
CREATE WAREHOUSE "HVM_WAREHOUSE";
CREATE DATABASE "HVM_DB";
CREATE SCHEMA "HVM_DB"."HVM_SCHEMA";
CREATE ROLE "HVM_INTEGRATION";
GRANT USAGE ON WAREHOUSE "HVM_WAREHOUSE" TO ROLE "HVM_INTEGRATION";
GRANT USAGE ON DATABASE "HVM_DB" TO ROLE "HVM_INTEGRATION";
GRANT USAGE ON SCHEMA "HVM_DB"."HVM_SCHEMA" TO ROLE "HVM_INTEGRATION";

// For an internal stage
GRANT READ ON STAGE "HVM_DB"."HVM_SCHEMA"."HVM_STAGE" TO ROLE "HVM_INTEGRATION";

// For an external stage
GRANT USAGE ON STAGE "HVM_DB"."HVM_SCHEMA"."HVM_STAGE" TO ROLE "HVM_INTEGRATION";

GRANT CREATE STAGE ON SCHEMA "HVM_DB"."HVM_SCHEMA" TO ROLE "HVM_INTEGRATION";
GRANT CREATE INTEGRATION ON ACCOUNT TO ROLE "HVM_INTEGRATION";

// There is no FUTURE grant support for databases, so migration of databases requires the ACCOUNTADMIN role.
// If only a limited list of databases will be migrated, an admin can create the databases manually, or can
// grant the ACCOUNTADMIN role to HVM_USER for a short period to migrate only the databases (use a specific migration rule).
// Note: Changing the properties of a database, including comments, requires the OWNERSHIP privilege on the database.

// Invoke for all schemas
GRANT SELECT, INSERT, TRUNCATE ON FUTURE TABLES IN SCHEMA <DB.SCHEMA> TO ROLE "HVM_INTEGRATION";
GRANT SELECT ON FUTURE VIEWS IN SCHEMA <DB.SCHEMA> TO ROLE "HVM_INTEGRATION";

// Create a new user with a public key for key pair authentication
CREATE USER "HVM_USER" RSA_PUBLIC_KEY="********" DEFAULT_ROLE = "HVM_INTEGRATION";

// Grant the configured HVM_INTEGRATION role to the new user
GRANT ROLE "HVM_INTEGRATION" TO USER "HVM_USER";
```
This example shows how you can create a stored procedure and grant the LiveData Migrator user permission to call it. This method calls the stored procedure rather than running the CREATE DATABASE command directly.

```sql
USE ROLE ACCOUNTADMIN;

CREATE OR REPLACE PROCEDURE demo_db.public.pTestDatabaseCreation(DBNAME string)
RETURNS text
LANGUAGE javascript
COMMENT = 'Create a database without ACCOUNTADMIN privileges'
EXECUTE AS OWNER
AS
$$
  var stmt = snowflake.createStatement({
    sqlText: "create database " + DBNAME
  });
  var ret = stmt.execute();
  return ret;
$$
;

GRANT USAGE ON PROCEDURE demo_db.public.pTestDatabaseCreation(string) TO ROLE SYSADMIN;

USE ROLE SYSADMIN;
CALL demo_db.public.pTestDatabaseCreation('my_test_database');
```
Snowflake agent timeout recommendations
Occasionally, migrations may fail due to brief network disruption or heavy data processing. Reduce the chance of these failures by setting the following timeout properties in the `hive agent add snowflake` CLI command:

| Property | Description | Recommended |
|---|---|---|
| `--network-timeout` | Number of milliseconds to wait for a response when interacting with the Snowflake service before returning an error. 0 (zero) specifies that no network timeout is set. Default=600000 | 600000 |
| `--query-timeout` | Number of seconds to wait for a query to complete before returning an error. 0 (zero) specifies that the driver should wait indefinitely. Default=0 | 0 |

The recommended 10-minute (600000 ms) `--network-timeout` ensures that brief network disruption won't cause migration failures, while also preventing migrations from hanging indefinitely. Leaving `--query-timeout` at its default value of 0 ensures that long-running queries never result in a timeout.
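For example, to apply both recommended values when adding the agent (this is a sketch only; the other required properties, elided here with `...`, still apply):

```
hive agent add snowflake basic --network-timeout 600000 --query-timeout 0 ...
```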
Configure Snowflake as a target with the UI
1. From the Dashboard, select an instance under Instances.
2. Under Filesystems & Agents, select Metastore Agents.
3. Under Metastore Agents, select Connect to Metastore.
4. Under Add a Metadata Agent, select the target filesystem you want to migrate your data to from the Filesystem list.
5. Under Metastore Type, select Snowflake.
6. Enter a Display Name for the agent.
7. Enter your User. This is the email address associated with the user that represents LiveData Migrator for migrations to Snowflake, for example, `user@domain.com`.
8. Select one of the following authentication types:
   - Basic Authentication
     Note: If you're using Basic Authentication, you'll need to reenter the password when updating this agent.
   - Private Key
     If you select private key authentication, the following entries are required:
     - Path to Private Key File: Enter the path to the file in which you store the private key on your Data Migrator cluster host. For example, `/home/hive/snowflake_key.p8`.
     - Private Key File Password: Enter the password you used when you created the private key file.
9. Enter an Account Identifier. For example, `<organization name>-<account name>`. See Account Identifier and Organization Name.
10. Enter a Warehouse Name. This is the name of the warehouse created in Snowflake. For example, `<cirata_whs>`. See Overview of Warehouses.
11. Enter a Stage Name. This is the name of the stage created in Snowflake that links the filesystem attached to this agent to the Snowflake environment. For example, `<cirata_stage>`. See Create Stage.
12. Enter a Stage Database. This is the Snowflake database that contains the stage. For example, `<cirata_db>`. See Database.
13. Enter a Stage Schema Name. This is the schema that contains the stage. See Database and Schema.
14. Enter a Schema Name. This is the schema into which this agent migrates tables. For example, the database and table `testdb.table1`, once migrated, display in Snowflake as `"testdb".PUBLIC."table1"`, where `PUBLIC` is the default value for the schema name.
15. Enter the following optional entries as needed:
    - Role: The role with which this agent acts. For example, a default Snowflake role or a custom role for your organization. You can enter a custom role for the JDBC connection used by Hive Migrator.
    - Default Filesystem Override: Override the default filesystem URI. We recommend this for complex use cases only.
16. Select Save.
Next steps
Create a metadata migration using the Snowflake target agent you just configured.
Monitor the following from the Dashboard:
- The progress of the migration.
- The status of the migration.
- The health of your agent connection.
To view the connection status:
- Select Check status from the ellipsis menu.
- Select Settings, then View agent.
Configure Snowflake as a target with the CLI
Run the following command to sign in to Data Migrator in the CLI:

```
livedata-migrator
```

Add Snowflake as a metastore agent by running one of the following commands, depending on the authentication method you want to use:

- Add a Snowflake agent with basic authentication:

  ```
  hive agent add snowflake basic
  ```

- Add a Snowflake agent with private key authentication:

  ```
  hive agent add snowflake privateKey
  ```
Enter values for the following properties depending on which authentication method you selected in the previous step:

Basic authentication properties

| Property | Description |
|---|---|
| `--password` | Your password for basic authentication. |

Private key authentication properties

| Property | Description |
|---|---|
| `--private-key-file` | Path to your private key file for private key authentication. |
| `--private-key-file-pwd` | Password that corresponds to the above private key file (if not empty). |
Enter values for the following properties to complete your Snowflake configuration:

| Property | Description |
|---|---|
| `--user` | Your Snowflake username. |
| `--file-system-id` | The target filesystem ID. In the UI, this is called Filesystem. |
| `--account-identifier` | A unique ID for your Snowflake account. See Account Identifier and Organization Name. |
| `--warehouse` | The name of the warehouse created in Snowflake. A Snowflake-based cluster of compute resources. |
| `--default-fs-override` | Optional. You can enter a filesystem to override the default target filesystem URI. We recommend this for complex use cases only. |
| `--name` | Optional. You can enter a name for the metastore agent. If you don't enter a name, Data Migrator generates a name automatically. |
| `--stage` | Temporary storage for data being migrated to Snowflake. A stage is an external stage that uses a cloud storage service from Amazon S3, Azure, or Google Cloud. |
| `--stage-database` | The Snowflake database that contains the stage. For example, `<cirata_db>`. See Database. |
| `--stage-schema` | The schema that contains the stage. See Database and Schema. |
| `--schema` | The schema into which this agent migrates tables. For example, the database and table `testdb.table1`, once migrated, display in Snowflake as `"testdb".PUBLIC."table1"`, where `PUBLIC` is the default value for the schema name. |
| `--role` | Optional. You can enter a custom role for the JDBC connection used by Hive Migrator. |
| `--network-timeout` | Number of milliseconds to wait for a response when interacting with the Snowflake service before returning an error. |
| `--query-timeout` | Number of seconds to wait for a query to complete before returning an error. |

Run the following commands to check or update your configuration:
| Command | Description |
|---|---|
| `hive agent check` | Check whether the Hive agent can connect to the metastore. |
| `hive agent delete` | Delete a Hive agent. |
| `hive agent list` | List all configured Hive agents. |
| `hive agent show` | Show the configuration for a Hive agent. |
| `hive agent types` | List supported Hive agent types. |
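For reference, a complete add command using private key authentication might look like the following sketch; every value shown is a placeholder assumed for illustration, so substitute your own account, warehouse, stage, and key details:

```
hive agent add snowflake privateKey --name snowflakeAgent --user user@domain.com --file-system-id mytarget --account-identifier myorg-myaccount --warehouse cirata_whs --stage cirata_stage --stage-database cirata_db --stage-schema PUBLIC --schema PUBLIC --private-key-file /home/hive/snowflake_key.p8 --private-key-file-pwd MyKeyPassword
```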
Next steps
Create a metadata migration with the CLI using the Snowflake target agent you just configured.