dbt Projects on Snowflake¶
dbt Core (https://github.com/dbt-labs/dbt-core) is an open-source data transformation tool and framework that you can use to define, test, and deploy SQL transformations. With dbt Projects on Snowflake, you can use familiar Snowflake features to create, edit, test, run, and manage your dbt Core projects. A dbt project is a directory that contains a dbt_project.yml
file and a set of files that define dbt assets, such as models and sources.
You can use Workspaces in Snowsight to work with dbt project files and directories, and then deploy a project as a schema-level DBT PROJECT object. You can also use SQL to work with dbt project objects, use Snowflake CLI commands to integrate deployment and execution into your CI/CD workflows, use tasks to schedule and orchestrate dbt project runs, and use Snowflake monitoring tools to inspect, manage, and tune project runs.
Using workspaces for dbt Projects on Snowflake¶
Workspaces in Snowsight offer a web-based integrated development environment (IDE) for dbt projects that can connect and sync to a Git repository. Each workspace for dbt Projects on Snowflake can represent a single dbt project or multiple dbt projects, depending on how you organize your files and folders.
You can use a workspace for dbt Projects on Snowflake to visualize, test, and run dbt projects directly in Snowflake. You can also connect the workspace to a dbt project object in Snowflake, so you can create and update objects from within the workspace. Workspaces also provide a quick way to initialize (or scaffold) a new dbt project, which creates the necessary files and directories for a dbt project, including the dbt_project.yml
file.
In addition to supporting dbt projects, Workspaces provide a unified editor for you to create, organize, and manage code across multiple file types and projects within Snowflake. For more information, see Workspaces.
Understanding dbt project objects¶
A DBT PROJECT is a schema-level object that contains versioned source files for your dbt project in Snowflake. You can connect a dbt project object to a workspace, or you can create and manage the object independently of a workspace.
You can CREATE, ALTER, and DROP dbt project objects like other schema-level objects in Snowflake. dbt project objects also support role-based access control (RBAC). You can use the EXECUTE DBT PROJECT command from a Snowflake warehouse to run dbt commands like test
and run
. You can also use tasks to schedule execution of these commands.
A dbt project object is typically based on a dbt project directory that contains a dbt-project.yml
file. This is the pattern that Snowflake uses when you deploy (create) a dbt project object from within a workspace.
You can also choose to create a dbt project from a parent directory that contains multiple dbt projects in subdirectories. When you create a dbt project object in this way, you must use the PROJECT_ROOT parameter in any EXECUTE DBT PROJECT commands that you use with the project. This parameter specifies the path to the subdirectory that contains the dbt_project.yml
file of the project.
CI/CD workflows for dbt project objects¶
dbt project objects support Snowflake CLI commands that you can use to create and manage dbt projects from the command line. This is useful for integrating dbt projects into your data engineering workflows and CI/CD pipelines. For more information, see Snowflake CLI, Integrating CI/CD with Snowflake CLI, and snow dbt commands.
Versioning for dbt project objects and files¶
Snowflake maintains versions of dbt project objects and their corresponding project files. You can use this versioning to track and manage changes throughout your data development and deployment lifecycle. Snowflake identifies dbt project object versions in the dbt project stage as shown in the following example:
snow://dbt/my_db.my_schema/my_dbt_project_object/versions/version_id
version_id
can be any of the following identifiers:
VERSION$<num>
- specifies a version identifier in the formVERSION$<num>
, where<num>
is a positive integer, for example,VERSION$1
.The version number begins at
1
when you create a dbt project object and increments by one with each new version of the dbt project object.Snowflake increments the version identifier when you perform the following tasks:
Redeploy dbt project from a workspace (runs the ALTER command).
Update the project by using the ALTER DBT PROJECT command.
Run the Snow CLI
snow dbt deploy
command without the--force
option.Snowflake resets the version identifier to
1
and removes all version aliases when you run the following commands:
The CREATE DBT PROJECT command in SQL with the
OR REPLACE
option.The
snow dbt deploy
command without the--force
option in the Snowflake CLI.
LAST
- Indicates the most recent version of the dbt project object.
FIRST
- Indicates the oldest version of the dbt project object.
version_name_alias
- Indicates a custom version name alias that you have created for a specific version of the dbt project object using the ALTER DBT PROJECT command with the ADD VERSION option. A version name alias always maps to a specific version identifier, such asVERSION$3
.
Project files stored in the dbt project stage are organized by version, with each version having its own subdirectory. For example, a dbt project object named my_dbt_project_object
with a version identifier of VERSION$3
and a dbt project file named my_dbt_project.yml
can be referenced as shown in the following example:
snow://dbt/my_db.my_schema/my_dbt_project_object/versions/VERSION$3/my_dbt_project.yml
Requirements, considerations, and limitations¶
Before you use dbt Projects on Snowflake, review the requirements, considerations, and limitations in this section.
dbt projects¶
The following requirements, considerations, and limitations apply to dbt project configurations that are supported by dbt Projects on Snowflake:
Only dbt Core projects are supported. dbt Cloud projects aren’t supported. dbt Projects on Snowflake runs dbt-core version 1.9.4 and dbt-snowflake version 1.9.2. When you migrate an existing dbt project to Snowflake, your dbt versions don’t have to align with Snowflake versions.
Each dbt project folder in your Snowflake workspace must contain a
profiles.yml
file that specifies a targetwarehouse
,database
,schema
, androle
in Snowflake for the project. Thetype
must be set tosnowflake
. dbt requires anaccount
anduser
, but these can be left with an empty or arbitrary string because the dbt project runs in Snowflake under the current account and user context.A dbt project in a workspace can’t have more than 20,000 files in its folder structure. This limit includes all files in the dbt project directory and subdirectories, including the
target/dbt_packages/logs
directories, which is where log files are saved when a dbt project runs from within the workspace.
Stored procedures¶
When you use a stored procedure to call EXECUTE DBT PROJECT, use a caller’s rights stored procedure. For more information, see CREATE PROCEDURE and Creating a stored procedure.
Workspaces for dbt Projects on Snowflake¶
The following requirements, considerations, and limitations apply to workspaces for dbt Projects on Snowflake:
Personal database requirement¶
Workspaces are created within a personal database and cannot be shared with other users. Personal databases must be enabled at the account level, which requires ACCOUNTADMIN permissions. For more information, see Managing Workspaces.
Secondary roles are required per project, per session¶
To execute dbt commands from within a workspace for dbt Projects on Snowflake, you must enable secondary roles. If secondary roles are not enabled for a project and session, you are prompted to enable secondary roles from the Profile list when you first execute a command. This selection must be made for each project, and the selection does not persist beyond the current browser session.
If you have an Enterprise Edition account, session policies might be configured to disallow secondary roles for the account or for specific users. These users will not be able to execute dbt commands from within a workspace for dbt Projects on Snowflake. For more information about session policies, see Specifying secondary roles in a session policy.
Git repositories¶
The following requirements, considerations, and limitations apply when you connect a workspace for dbt Projects on Snowflake to a Git repository:
Setting up an API integration on Snowflake is required to set up a Git repository object in Snowflake. For more information, see Create an API integration for interacting with the repository API.
Git repositories that can be accessed only through PrivateLink aren’t supported.
Git repositories larger than 2GB aren’t supported.
You can’t connect to an empty Git repository. You must have at least one commit.
dbt dependencies¶
The following requirements, considerations, and limitations apply to dbt dependencies for dbt projects in dbt Projects on Snowflake:
You must execute the dbt
deps
command from within a workspace for dbt Projects on Snowflake to build and populate thedbt_packages
project folder. You can’t execute thedeps
command by using the EXECUTE DBT PROJECT SQL command or thesnow dbt deploy
CLI command. Alternatively, you can rundeps
on your local dbt project before deploying to Snowflake as a dbt project object.Specifying Git packages (https://docs.getdbt.com/docs/build/packages#git-packages) in the
packages.yml
file isn’t supported.A network rule and external access integration are required to allow Snowflake to access the repositories for the dependencies. For more information, see Create an external access integration in Snowflake for dbt dependencies. For more information about external network access limitations, see External network access limitations.
Telemetry, logging, and tracing¶
The following requirements, considerations, and limitations apply to telemetry, logging, and tracing for dbt on Snowflake:
Workspaces for dbt Projects on Snowflake don’t stream stdout dynamically, and stdout is only viewable upon command completion.
Viewing logs and tracing requires that you set the LOG_LEVEL and TRACE_LEVEL on the dbt project object. For more information, see Access control for dbt projects on Snowflake and Monitor dbt Projects on Snowflake.
By default, Snowflake collects telemetry in the default SNOWFLAKE.TELEMETRY.EVENTS table. If you have a custom event table that is set as the event table for your account, telemetry data is collected there. If you use an Enterprise Edition account, you can create an event table to collect telemetry data and associate it with the database where the dbt project object is deployed. For more information, see Event table overview.
Supported dbt commands¶
The following table shows the dbt commands that are supported in dbt Projects on Snowflake. Any dbt command (https://docs.getdbt.com/reference/dbt-commands) that isn’t listed here isn’t supported.
dbt Projects on Snowflake, supported dbt commands by execution method¶ dbt command
Workspaces
EXECUTE DBT PROJECT
snow dbt execute
(CLI)build (https://docs.getdbt.com/reference/build)
✔
✔
✔
compile (https://docs.getdbt.com/reference/commands/compile)
✔
✔
✔
deps (https://docs.getdbt.com/reference/commands/deps)
✔
❌
✔
list (https://docs.getdbt.com/reference/commands/list)
✔
✔
✔
parse (https://docs.getdbt.com/reference/commands/parse)
❌
✔
✔
run (https://docs.getdbt.com/reference/commands/run)
✔
✔
✔
run-operation (https://docs.getdbt.com/reference/run-operation)
✔
✔
✔
seed (https://docs.getdbt.com/reference/seed)
✔
✔
✔
show (https://docs.getdbt.com/reference/commands/show)
✔
✔
✔
snapshot (https://docs.getdbt.com/reference/snapshot)
✔
✔
✔
test (https://docs.getdbt.com/reference/commands/test)
✔
✔
✔