Note
The Snowflake Connector for PostgreSQL and Snowflake Connector for MySQL are subject to the Connector Terms.
Tutorial: Get started with the MySQL and PostgreSQL connectors for Snowflake¶
Introduction¶
Welcome to our tutorial on using the Snowflake Database Connectors. This guide will help you seamlessly transfer data from relational databases into Snowflake.
In this tutorial, you’ll gain the skills to:
- Set up MySQL and PostgreSQL in Docker, complete with sample data for ingestion.
- Install and configure two native applications, one for each database.
- Set up and fine-tune two agents, again one for each database.
- Initiate and manage data ingestion processes.
- Monitor the data ingestion workflow.
Let’s get started!
Prerequisites¶
Before beginning this tutorial, ensure you meet the following requirements:
- Docker is installed and operational on your local machine.
- You have a tool available for connecting to the database. This can be a database-specific tool or a general-purpose tool such as IntelliJ or Visual Studio Code.
Creating MySQL and PostgreSQL Source Databases¶
In this section, we will guide you through the following steps:
- Starting the Database Instances - Learn how to launch your MySQL and PostgreSQL instances using Docker.
- Connecting to the Database - Instructions on how to establish a connection to your databases.
- Loading Sample Data - A walkthrough on how to populate your databases with sample data.
Starting the database instances¶
To begin the MySQL and PostgreSQL database configuration process using Docker, create the file docker-compose.yaml.
The content of the file should resemble:
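A minimal sketch consistent with the ports and credentials used later in this tutorial (the image tags and the wal_level setting are assumptions of this sketch, not the tutorial's exact file):

```yaml
# Two source databases exposed on their default ports.
services:
  mysql:
    image: mysql:8.0
    environment:
      MYSQL_ROOT_PASSWORD: mysql        # matches the credentials used below
    ports:
      - "3306:3306"
  postgres:
    image: postgres:15
    environment:
      POSTGRES_PASSWORD: postgres       # matches the credentials used below
    ports:
      - "5432:5432"
    # Logical decoding is typically required for CDC-style replication (assumption).
    command: ["postgres", "-c", "wal_level=logical"]
```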
Once your docker-compose.yaml is ready, follow these steps:
- Open a terminal.
- Navigate to the directory containing the docker-compose.yaml file.
- Execute the following command to start the source databases in containers:
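Assuming Docker Compose v2 is available, that command is:

```bash
docker compose up -d
```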
After running this command, you should see two containers actively running the source databases.
Connecting to the Database¶
To connect to the pre-configured databases using the database connections in IntelliJ or Visual Studio Code, perform the following steps with the provided credentials:

For MySQL:

- Open your tool of choice for connecting to MySQL.
- Click the ‘+’ sign or similar to add a data source.
- Fill in the connection details:
  - User: root
  - Password: mysql
  - URL: jdbc:mysql://localhost:3306
- Test the connection and save.

For PostgreSQL:

- Open your tool of choice for connecting to PostgreSQL.
- Click the ‘+’ sign or similar to add a data source.
- Fill in the connection details:
  - User: postgres
  - Password: postgres
  - Database: postgres
  - URL: jdbc:postgresql://localhost:5432
- Test the connection and save.
Loading Sample Data¶
To initialize and load the sample data, execute the following scripts in the corresponding database connections.
Execute the following script in the MySQL connection to generate sample data:
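A minimal stand-in for the original script, assuming a hypothetical mysql_db.orders table with three rows:

```sql
-- Hypothetical sample schema and data for the MySQL source.
CREATE DATABASE IF NOT EXISTS mysql_db;

CREATE TABLE mysql_db.orders (
    id INT PRIMARY KEY,
    item VARCHAR(100),
    quantity INT
);

INSERT INTO mysql_db.orders (id, item, quantity) VALUES
    (1, 'keyboard', 2),
    (2, 'monitor', 1),
    (3, 'mouse', 3);
```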
Execute the following script in the PostgreSQL connection to generate sample data:
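A minimal stand-in for the PostgreSQL side, assuming a hypothetical public.orders table with three rows:

```sql
-- Hypothetical sample schema and data for the PostgreSQL source.
CREATE TABLE public.orders (
    id INT PRIMARY KEY,
    item VARCHAR(100),
    quantity INT
);

INSERT INTO public.orders (id, item, quantity) VALUES
    (1, 'keyboard', 2),
    (2, 'monitor', 1),
    (3, 'mouse', 3);
```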
You should see three rows in each populated database.
Install and configure the Native App¶
During this step, you will install and configure the native applications for MySQL and PostgreSQL.
Install the Native Applications¶
Follow these steps to install the Application from the Snowflake Native Apps Marketplace:
- Sign in to Snowsight.
- In the navigation menu, select Marketplace » Snowflake Marketplace.
- Locate the Snowflake Connector for MySQL and Snowflake Connector for PostgreSQL applications.
- Install both applications.
After installation, you will see the new applications listed in Catalog » Apps.
Configuring the Native Applications¶
- Sign in to Snowsight.
- In the navigation menu, select Catalog » Apps.
- Open each application and do the following:

For the Snowflake Connector for MySQL:

- Select Download Driver and save the file. The file name will resemble mariadb-java-client-3.4.1.jar, or a newer version when available. Save this file for use during agent configuration.
- Select Mark all as done, as we will create and populate the source databases from scratch.

  Note

  No additional network configuration is required at this point, as we’ll configure the agent later in the tutorial.

- Click Start configuration.
- On the Configure Connector screen, select Configure. The Verify Agent Connection page will display.
- Select Generate file to generate an agent configuration file. The file name should resemble snowflake.json. Save this file for later use in the Agent Configuration section.

For the Snowflake Connector for PostgreSQL:

- Select Mark all as done, as we will create and populate the source databases from scratch.

  Note

  No additional network configuration is required at this point, as we’ll configure the agent later in the tutorial.

- Click Start configuration.
- On the Configure Connector screen, select Configure.
- On the Verify Agent Connection page, select Generate file to generate the agent configuration file. The file name should resemble snowflake.json. Save this file for use during the Agent Configuration section.
Configure the agents¶
In this section, we’ll configure the agent that will operate with your source databases.
The first step is to create directories agent-mysql and agent-postgresql.
Within each directory, create subdirectories agent-keys and configuration. Your directory structure should resemble:
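```text
agent-mysql/
├── agent-keys/
└── configuration/
agent-postgresql/
├── agent-keys/
└── configuration/
```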
Creating configuration files¶
In this section, we’ll add content to the configuration files for each agent to operate correctly. The configuration files include:
- snowflake.json file to connect to Snowflake.
- datasources.json file to connect to the source databases.
- postgresql.conf / mysql.conf files with additional agent environment variables.
- JDBC driver file for the MySQL agent.
For the MySQL agent:

- In a terminal, navigate to the agent-mysql directory.
- Create the Docker Compose file docker-compose.yaml defining the agent container.
- Move the previously downloaded snowflake.json file into the configuration directory.
- Move the previously downloaded mariadb-java-client-3.4.1.jar file into the configuration directory.
- In the configuration directory, create datasources.json describing the MySQL source database (an illustrative sketch follows this list).
- In the configuration directory, create mysql.conf with the additional agent environment variables.
- Start the agent with Docker Compose, as shown after this list. There shouldn’t be any error messages, and the agent should generate a public and private key pair for authentication to Snowflake.
- Please note that the driver JAR file name should be identical to the one downloaded and referenced in the docker-compose.yaml and mysql.conf files.
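The authoritative contents of the agent’s docker-compose.yaml, datasources.json, and mysql.conf files come from the connector documentation; the following is only an illustrative sketch. The data source name MYSQLDS1, the JSON keys, and the host.docker.internal address are assumptions.

```json
{
  "MYSQLDS1": {
    "url": "jdbc:mariadb://host.docker.internal:3306",
    "username": "root",
    "password": "mysql",
    "ssl": false
  }
}
```

The agent is then started from the agent-mysql directory with Docker Compose:

```bash
docker compose up -d
docker compose logs -f   # optionally follow the agent logs to check for errors
```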
For the PostgreSQL agent:

- On the command line, navigate to the agent-postgresql directory.
- Create the Docker Compose file docker-compose.yaml defining the agent container.
- Move the previously downloaded snowflake.json file into the configuration directory.
- In the configuration directory, create datasources.json describing the PostgreSQL source database (an illustrative sketch follows this list).
- In the configuration directory, create postgresql.conf with the additional agent environment variables.
- Start the agent with Docker Compose, as shown after this list. There shouldn’t be any error messages, and the agent should generate a public and private key pair for authentication to Snowflake.
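As with the MySQL agent, the exact file formats come from the connector documentation; in this sketch the data source name PSQLDS1 and the JSON keys are assumptions.

```json
{
  "PSQLDS1": {
    "url": "jdbc:postgresql://host.docker.internal:5432/postgres",
    "username": "postgres",
    "password": "postgres",
    "ssl": false
  }
}
```

Then start the agent from the agent-postgresql directory:

```bash
docker compose up -d
```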
When complete, your directory structure should resemble the following. Please note the inclusion of the automatically generated private and public keys within the agent-keys directories.
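For example (the generated key file names are illustrative; the agent names them itself):

```text
agent-mysql/
├── docker-compose.yaml
├── agent-keys/
│   ├── database-connector-agent-app-private-key.p8
│   └── database-connector-agent-app-public-key.pub
└── configuration/
    ├── datasources.json
    ├── mariadb-java-client-3.4.1.jar
    ├── mysql.conf
    └── snowflake.json
agent-postgresql/
├── docker-compose.yaml
├── agent-keys/
│   ├── database-connector-agent-app-private-key.p8
│   └── database-connector-agent-app-public-key.pub
└── configuration/
    ├── datasources.json
    ├── postgresql.conf
    └── snowflake.json
```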
Verifying connection with Snowflake¶
Go back to your previously created native apps. Click on the Refresh button in the Agent Connection section.
When the agent is successfully configured, the Agent Connection section shows that the connection has been verified.
Configure and monitor the data ingestion process¶
In this step, we will instruct the Connector to begin replicating the selected tables. First, let’s create a shared sink database in Snowflake.
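A sketch of that step, assuming a sink database named CONNECTORS_DEST_DB and that the applications were installed under the names shown (all identifiers here are assumptions; adjust them to your account):

```sql
-- Create the shared sink database that both connectors will replicate into.
CREATE DATABASE IF NOT EXISTS CONNECTORS_DEST_DB;

-- Each connector application needs permission to create schemas in the sink database.
GRANT CREATE SCHEMA ON DATABASE CONNECTORS_DEST_DB TO APPLICATION SNOWFLAKE_CONNECTOR_FOR_MYSQL;
GRANT CREATE SCHEMA ON DATABASE CONNECTORS_DEST_DB TO APPLICATION SNOWFLAKE_CONNECTOR_FOR_POSTGRESQL;
```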
Once the database is ready, we can move on to the configuration process.
Do the following for each connector (MySQL and PostgreSQL):

- To begin table replication, you must first add a data source from which to replicate and then specify the table to be replicated.
- To monitor the replication, execute queries against the connector’s monitoring views, as sketched after this list.
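A sketch of the monitoring queries, assuming the applications expose the REPLICATION_STATE view (described below) in their PUBLIC schema and were installed under the names used here (both assumptions):

```sql
-- Replication status per table, one query per installed connector application.
SELECT * FROM SNOWFLAKE_CONNECTOR_FOR_MYSQL.PUBLIC.REPLICATION_STATE;
SELECT * FROM SNOWFLAKE_CONNECTOR_FOR_POSTGRESQL.PUBLIC.REPLICATION_STATE;
```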
Understanding connector status¶
The REPLICATION_STATE view is crucial for monitoring the status of table replication. This process encompasses three distinct phases:
- SCHEMA_INTROSPECTION: Ensures that the schema of the source table is accurately replicated.
- INITIAL_LOAD: Transfers the existing data from the source table to the destination.
- INCREMENTAL_LOAD: Continuously replicates ongoing changes from the source.
Upon successful replication, the status display will resemble the following:
| REPLICATION_PHASE | SCHEMA_INTROSPECTION_STATUS | SNAPSHOT_REPLICATION_STATUS | INCREMENTAL_REPLICATION_STATUS |
| --- | --- | --- | --- |
| INCREMENTAL_LOAD | DONE | DONE | IN PROGRESS |
You can read more about it in the official Connector Documentation.
View data¶
Execute the following commands to view data, which should include roughly 3 rows per database.
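A sketch, assuming the sink database from the earlier step; the schema and table names the connectors create depend on your data source and source schema names, so the fully qualified names below are hypothetical:

```sql
-- First list what the connectors actually created in the sink database.
SHOW SCHEMAS IN DATABASE CONNECTORS_DEST_DB;

-- Then query the replicated tables (hypothetical names; substitute what SHOW SCHEMAS returned).
SELECT * FROM CONNECTORS_DEST_DB.MYSQLDS1_MYSQL_DB.ORDERS;
SELECT * FROM CONNECTORS_DEST_DB.PSQLDS1_PUBLIC.ORDERS;
```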
Clean up and additional resources¶
Congratulations! You have successfully completed this tutorial.
To clean up your environment, execute the commands listed below. Failing to do so will leave the connectors running and generating costs.
Remove the native app¶
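A sketch, assuming the applications were installed under their default names (the names below are assumptions; use the names shown under Catalog » Apps):

```sql
-- Drop both connector applications and the objects they own.
DROP APPLICATION IF EXISTS SNOWFLAKE_CONNECTOR_FOR_MYSQL CASCADE;
DROP APPLICATION IF EXISTS SNOWFLAKE_CONNECTOR_FOR_POSTGRESQL CASCADE;
```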
Remove warehouses, roles and users¶
During the installation, multiple warehouses, roles, and users were created. Execute the following queries to drop those objects.
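The exact object names depend on the connector versions installed, so inspect them first; the DROP statements in this sketch use placeholders rather than real names:

```sql
-- Review these lists for the connector-related warehouses, roles, and users.
SHOW WAREHOUSES;
SHOW ROLES;
SHOW USERS;

-- Replace the placeholders with the names identified above before running.
DROP WAREHOUSE IF EXISTS <connector_warehouse_name>;
DROP ROLE IF EXISTS <connector_role_name>;
DROP USER IF EXISTS <agent_user_name>;
```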
Stop database containers¶
To stop the running containers with MySQL and PostgreSQL, navigate to the directory containing the docker-compose.yaml file for the source databases, then execute docker compose down -v.
Additional resources¶
Continue learning about connectors using the following resources: