Data migration and data validation

Data migration copies rows from a source system into Snowflake tables (historical loads, partitions, and optional incremental runs). Data validation compares what landed in Snowflake with the source (schema, metrics, and optional row-level checks). SnowConvert AI runs both through the same Orchestrator and Workers model, driven by the Snowflake AIM Migration Agent in Snowflake CoCo or the scai data … CLI from a migration project.

Note

Cloud data migration and validation always use Snowflake as the target. PostgreSQL (and other dialects) are supported as sources you migrate from, not as destinations you load into through these workflows.


Data migration and validation in the Snowflake AIM Migration Agent

The agent’s guided workflow covers deployment, data movement, and checks after SQL conversion:

| Capability | What it means |
| --- | --- |
| Migrate data | Copy rows from source tables into Snowflake with automatic row-count validation as part of the flow |
| Validate data | Compare schema, metrics, and optionally rows between source and Snowflake after migration |
| migrate-objects | Deploys objects wave by wave; for tables, deploys to Snowflake then migrates data from the source |

Procedure-level assurance uses baseline-capture and the migrate-objects test loop (two-sided validation of function and procedure output against captured baselines). That complements table-level migration and validation above.

Example prompts (from the Snowflake AIM Migration Agent topic):

"Migrate data"
"Validate data"

For cloud table validation (schema, metrics, and optional row-level checks), ask the agent to run cloud data validation or use scai data validate start from your project directory.
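To make the idea of a row-count check concrete, here is a minimal shell sketch that compares per-table counts from two plain-text files. It is an illustration only: it does not reproduce how SnowConvert AI runs its validation, and the function name `compare_counts`, the file layout (one `table_name row_count` pair per line), and the sample counts are all assumptions made for this example.

```shell
# Illustrative only: SnowConvert AI performs its own row-count validation.
# Each input file holds one "table_name row_count" pair per line.
compare_counts() {
  awk '
    FILENAME == ARGV[1] { src[$1] = $2; next }   # first file: source counts
                        { tgt[$1] = $2 }         # second file: Snowflake counts
    END {
      for (t in src) {
        if (!(t in tgt)) { print t, "MISSING" }
        else if (src[t] != tgt[t]) { print t, "MISMATCH", src[t], tgt[t] }
        else { print t, "OK" }
      }
    }' "$1" "$2" | sort
}

# Demo with made-up counts (hypothetical tables, not real output).
demo_dir=$(mktemp -d)
printf 'orders 100\ncustomers 20\nitems 7\n' > "$demo_dir/source.txt"
printf 'orders 100\ncustomers 19\n' > "$demo_dir/snowflake.txt"
compare_counts "$demo_dir/source.txt" "$demo_dir/snowflake.txt"
rm -r "$demo_dir"
```

With the sample files, the demo reports `customers` as a mismatch (20 versus 19), `items` as missing from the target, and `orders` as matching. Schema and metrics checks go further than this count comparison, which is why the dedicated validation workflow exists.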


Supported source systems (cloud)

Not every source supports every command. Use this table for cloud workflows (scai data migrate … and scai data validate …). Legacy commands (migrate-legacy, validate-legacy) have a narrower dialect list; see the technical pages below.

| Source | Cloud data migration | Cloud data validation | Worker connectivity highlights |
| --- | --- | --- | --- |
| SQL Server | Yes | Yes | ODBC (or BCP when enabled) |
| Amazon Redshift | Yes | Yes | ODBC; optional UNLOAD to S3 |
| Oracle | Yes | Yes | Oracle Instant Client + ODBC |
| PostgreSQL | Yes | Yes | Npgsql (no ODBC); set ssl_mode in Worker TOML |
| Teradata | Yes | Yes | teradatasql or ODBC; regular, write_nos, or tpt extraction for migration |
| Azure Synapse | Planned | Planned | ODBC (coming soon) |

For the full capability matrix (code extraction, deploy, testing, and more), see Supported source systems in the Snowflake AIM Migration Agent.

Note

Cloud workflows use the Snowflake Python connector (not the Snowflake CLI). Orchestrator and Worker hosts need Python 3.11 or later. Running validation consumes Snowflake credits for the L2 (metrics) and L3 (row-level) warehouse queries.
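A quick way to confirm the Python requirement on an Orchestrator or Worker host is a small version check. The sketch below is one possible approach, not part of the scai tooling: the function name `meets_python_311` is made up for this example, and it assumes the interpreter is available on PATH as `python3`.

```shell
# Return success if a version string such as "3.11.4" is 3.11 or newer.
meets_python_311() {
  major=${1%%.*}        # text before the first dot
  rest=${1#*.}          # text after the first dot
  minor=${rest%%.*}     # second component
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 11 ]; }
}

# Assumption: the host exposes the interpreter as python3.
if command -v python3 >/dev/null 2>&1; then
  found=$(python3 -c 'import platform; print(platform.python_version())')
  if meets_python_311 "$found"; then
    echo "Python $found meets the 3.11+ requirement"
  else
    echo "Python $found is older than 3.11" >&2
  fi
fi
```

If the hosts use a virtual environment or a differently named interpreter, point the check at that executable instead.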


Quick start by source

After scai init, scai code extract, scai code convert, and scai code deploy, use the connection and workflow commands for your dialect.

| Source | Register source | Cloud migration | Cloud validation |
| --- | --- | --- | --- |
| SQL Server | scai connection add-sql-server | scai data migrate start | scai data validate start |
| Amazon Redshift | scai connection add-redshift | scai data migrate start | scai data validate start |
| Teradata | scai connection add-teradata | scai data migrate start | scai data validate start |
| Oracle | scai connection add-oracle | scai data migrate start | scai data validate start |
| PostgreSQL | scai connection add-postgresql | scai data migrate start | scai data validate start |
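The table boils down to one source-specific registration command followed by the same two workflow commands. The following dry-run sketch prints that sequence for a chosen source without executing anything. The helper names `register_command` and `quick_start_plan` and the short source keys are hypothetical. The commands are printed exactly as the table lists them, so any arguments or prompts they need are not shown.

```shell
# Map a short source key to its registration command (keys are an assumption,
# taken from the suffix of each "scai connection add-..." command in the table).
register_command() {
  case "$1" in
    sql-server|redshift|teradata|oracle|postgresql)
      echo "scai connection add-$1" ;;
    *)
      echo "no cloud quick start for source: $1" >&2
      return 1 ;;
  esac
}

# Print the three quick-start commands in table order; nothing is executed.
# Prerequisite steps (scai init, scai code extract/convert/deploy) are not repeated here.
quick_start_plan() {
  reg=$(register_command "$1") || return 1
  printf '%s\n' "$reg" "scai data migrate start" "scai data validate start"
}

quick_start_plan postgresql
```

Printing the plan first makes it easy to review the order before running each command by hand from the project directory.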

Workflow configuration is YAML or JSON (for example .scai/config/data-migration-config.yaml and data-validation-config.yaml; format is determined by file extension). scai data migrate generate-config and scai data validate generate-config set source_platform from your project dialect.
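Because the format follows the file extension, a wrapper script can check a config path before passing it on. This is a minimal sketch of that rule, not the CLI's own logic: `config_format` is a hypothetical helper, and since the source does not say whether `.yml` is accepted, the sketch recognizes only `.yaml` and `.json`.

```shell
# Report the workflow config format implied by a file name.
# Assumption: only the .yaml and .json extensions are recognized here.
config_format() {
  case "$1" in
    *.yaml) echo "yaml" ;;
    *.json) echo "json" ;;
    *)
      echo "unrecognized config extension: $1" >&2
      return 1 ;;
  esac
}

config_format .scai/config/data-migration-config.yaml
config_format data-validation-config.json
```

The two calls print `yaml` and `json` respectively; any other extension produces an error message and a non-zero exit status.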

Generate Worker TOML before your first run:

scai data worker generate-config .scai/settings/DataExchangeWorkerConfig.toml
scai connection test -l <dialect> -s <profile> --json

Per-source platform details (drivers, extraction strategies, Teradata complete configuration, Worker TOML examples, validation thresholds, column regex filtering, and dashboard reporting) live on the technical pages linked in the next section.


Technical documentation

Use these pages for architecture, prerequisites, scai data worker / scai data orchestrator setup, workflow reference, monitoring, and per-source Worker TOML.

On the data migration page, open the Source-platform specifics and Source connection configuration examples sections, which have tabs for SQL Server, Amazon Redshift, Teradata, Oracle, and PostgreSQL. The data validation page has matching tabs for validation behavior and YAML examples.