dbt Projects on Snowflake

dbt Core (https://github.com/dbt-labs/dbt-core) is an open-source data transformation tool and framework that you can use to define, test, and deploy SQL transformations. With dbt Projects on Snowflake, you can use familiar Snowflake features to create, edit, test, run, and manage your dbt Core projects. A dbt project is a directory that contains a dbt_project.yml file and a set of files that define dbt assets, such as models and sources.

You can use Workspaces in Snowsight to work with dbt project files and directories, and then deploy a project as a schema-level DBT PROJECT object. You can also use SQL to work with dbt project objects, use Snowflake CLI commands to integrate deployment and execution into your CI/CD workflows, use tasks to schedule and orchestrate dbt project runs, and use Snowflake monitoring tools to inspect, manage, and tune project runs.

Using workspaces for dbt Projects on Snowflake

Workspaces in Snowsight offer a web-based integrated development environment (IDE) for dbt projects that can connect and sync to a Git repository. Each workspace for dbt Projects on Snowflake can represent a single dbt project or multiple dbt projects, depending on how you organize your files and folders.

You can use a workspace for dbt Projects on Snowflake to visualize, test, and run dbt projects directly in Snowflake. You can also connect the workspace to a dbt project object in Snowflake, so you can create and update objects from within the workspace. Workspaces also provide a quick way to initialize (or scaffold) a new dbt project, which creates the necessary files and directories for a dbt project, including the dbt_project.yml file.

In addition to supporting dbt projects, Workspaces provide a unified editor for you to create, organize, and manage code across multiple file types and projects within Snowflake. For more information, see Workspaces.

Understanding dbt project objects

A DBT PROJECT is a schema-level object that contains versioned source files for your dbt project in Snowflake. You can connect a dbt project object to a workspace, or you can create and manage the object independently of a workspace.

You can CREATE, ALTER, and DROP dbt project objects like other schema-level objects in Snowflake. dbt project objects also support role-based access control (RBAC). You can use the EXECUTE DBT PROJECT command from a Snowflake warehouse to run dbt commands like test and run. You can also use tasks to schedule execution of these commands.

A dbt project object is typically based on a dbt project directory that contains a dbt-project.yml file. This is the pattern that Snowflake uses when you deploy (create) a dbt project object from within a workspace.

You can also choose to create a dbt project from a parent directory that contains multiple dbt projects in subdirectories. When you create a dbt project object in this way, you must use the PROJECT_ROOT parameter in any EXECUTE DBT PROJECT commands that you use with the project. This parameter specifies the path to the subdirectory that contains the dbt_project.yml file of the project.

CI/CD workflows for dbt project objects

dbt project objects support Snowflake CLI commands that you can use to create and manage dbt projects from the command line. This is useful for integrating dbt projects into your data engineering workflows and CI/CD pipelines. For more information, see Snowflake CLI, Integrating CI/CD with Snowflake CLI, and snow dbt commands.

Versioning for dbt project objects and files

Snowflake maintains versions of dbt project objects and their corresponding project files. You can use this versioning to track and manage changes throughout your data development and deployment lifecycle. Snowflake identifies dbt project object versions in the dbt project stage as shown in the following example:

snow://dbt/my_db.my_schema/my_dbt_project_object/versions/version_id

version_id can be any of the following identifiers:

  • VERSION$<num> - specifies a version identifier in the form VERSION$<num>, where <num> is a positive integer, for example, VERSION$1.

    The version number begins at 1 when you create a dbt project object and increments by one with each new version of the dbt project object.

    Snowflake increments the version identifier when you perform the following tasks:

    • Redeploy dbt project from a workspace (runs the ALTER command).

    • Update the project by using the ALTER DBT PROJECT command.

    • Run the Snow CLI snow dbt deploy command without the --force option.

    Snowflake resets the version identifier to 1 and removes all version aliases when you run the following commands:

    • The CREATE DBT PROJECT command in SQL with the OR REPLACE option.

    • The snow dbt deploy command without the --force option in the Snowflake CLI.

  • LAST - Indicates the most recent version of the dbt project object.

  • FIRST - Indicates the oldest version of the dbt project object.

  • version_name_alias - Indicates a custom version name alias that you have created for a specific version of the dbt project object using the ALTER DBT PROJECT command with the ADD VERSION option. A version name alias always maps to a specific version identifier, such as VERSION$3.

Project files stored in the dbt project stage are organized by version, with each version having its own subdirectory. For example, a dbt project object named my_dbt_project_object with a version identifier of VERSION$3 and a dbt project file named my_dbt_project.yml can be referenced as shown in the following example:

snow://dbt/my_db.my_schema/my_dbt_project_object/versions/VERSION$3/my_dbt_project.yml

Requirements, considerations, and limitations

Before you use dbt Projects on Snowflake, review the requirements, considerations, and limitations in this section.

dbt projects

The following requirements, considerations, and limitations apply to dbt project configurations that are supported by dbt Projects on Snowflake:

  • Only dbt Core projects are supported. dbt Cloud projects aren’t supported. dbt Projects on Snowflake runs dbt-core version 1.9.4 and dbt-snowflake version 1.9.2. When you migrate an existing dbt project to Snowflake, your dbt versions don’t have to align with Snowflake versions.

  • Each dbt project folder in your Snowflake workspace must contain a profiles.yml file that specifies a target warehouse, database, schema, and role in Snowflake for the project. The type must be set to snowflake. dbt requires an account and user, but these can be left with an empty or arbitrary string because the dbt project runs in Snowflake under the current account and user context.

  • A dbt project in a workspace can’t have more than 20,000 files in its folder structure. This limit includes all files in the dbt project directory and subdirectories, including the target/dbt_packages/logs directories, which is where log files are saved when a dbt project runs from within the workspace.

Stored procedures

When you use a stored procedure to call EXECUTE DBT PROJECT, use a caller’s rights stored procedure. For more information, see CREATE PROCEDURE and Creating a stored procedure.

Workspaces for dbt Projects on Snowflake

The following requirements, considerations, and limitations apply to workspaces for dbt Projects on Snowflake:

Personal database requirement

Workspaces are created within a personal database and cannot be shared with other users. Personal databases must be enabled at the account level, which requires ACCOUNTADMIN permissions. For more information, see Managing Workspaces.

Secondary roles are required per project, per session

To execute dbt commands from within a workspace for dbt Projects on Snowflake, you must enable secondary roles. If secondary roles are not enabled for a project and session, you are prompted to enable secondary roles from the Profile list when you first execute a command. This selection must be made for each project, and the selection does not persist beyond the current browser session.

If you have an Enterprise Edition account, session policies might be configured to disallow secondary roles for the account or for specific users. These users will not be able to execute dbt commands from within a workspace for dbt Projects on Snowflake. For more information about session policies, see Specifying secondary roles in a session policy.

Git repositories

The following requirements, considerations, and limitations apply when you connect a workspace for dbt Projects on Snowflake to a Git repository:

  • Setting up an API integration on Snowflake is required to set up a Git repository object in Snowflake. For more information, see Create an API integration for interacting with the repository API.

  • Git repositories that can be accessed only through PrivateLink aren’t supported.

  • Git repositories larger than 2GB aren’t supported.

  • You can’t connect to an empty Git repository. You must have at least one commit.

dbt dependencies

The following requirements, considerations, and limitations apply to dbt dependencies for dbt projects in dbt Projects on Snowflake:

  • You must execute the dbt deps command from within a workspace for dbt Projects on Snowflake to build and populate the dbt_packages project folder. You can’t execute the deps command by using the EXECUTE DBT PROJECT SQL command or the snow dbt deploy CLI command. Alternatively, you can run deps on your local dbt project before deploying to Snowflake as a dbt project object.

  • Specifying Git packages (https://docs.getdbt.com/docs/build/packages#git-packages) in the packages.yml file isn’t supported.

  • A network rule and external access integration are required to allow Snowflake to access the repositories for the dependencies. For more information, see Create an external access integration in Snowflake for dbt dependencies. For more information about external network access limitations, see External network access limitations.

Telemetry, logging, and tracing

The following requirements, considerations, and limitations apply to telemetry, logging, and tracing for dbt on Snowflake:

  • Workspaces for dbt Projects on Snowflake don’t stream stdout dynamically, and stdout is only viewable upon command completion.

  • Viewing logs and tracing requires that you set the LOG_LEVEL and TRACE_LEVEL on the dbt project object. For more information, see Access control for dbt projects on Snowflake and Monitor dbt Projects on Snowflake.

  • By default, Snowflake collects telemetry in the default SNOWFLAKE.TELEMETRY.EVENTS table. If you have a custom event table that is set as the event table for your account, telemetry data is collected there. If you use an Enterprise Edition account, you can create an event table to collect telemetry data and associate it with the database where the dbt project object is deployed. For more information, see Event table overview.

Supported dbt commands

The following table shows the dbt commands that are supported in dbt Projects on Snowflake. Any dbt command (https://docs.getdbt.com/reference/dbt-commands) that isn’t listed here isn’t supported.

dbt Projects on Snowflake, supported dbt commands by execution method

dbt command

Workspaces

EXECUTE DBT PROJECT

snow dbt execute (CLI)

build (https://docs.getdbt.com/reference/build)

compile (https://docs.getdbt.com/reference/commands/compile)

deps (https://docs.getdbt.com/reference/commands/deps)

list (https://docs.getdbt.com/reference/commands/list)

parse (https://docs.getdbt.com/reference/commands/parse)

run (https://docs.getdbt.com/reference/commands/run)

run-operation (https://docs.getdbt.com/reference/run-operation)

seed (https://docs.getdbt.com/reference/seed)

show (https://docs.getdbt.com/reference/commands/show)

snapshot (https://docs.getdbt.com/reference/snapshot)

test (https://docs.getdbt.com/reference/commands/test)

Language: English