dbt Projects on Snowflake
dbt Core (https://github.com/dbt-labs/dbt-core) is an open-source data transformation tool and framework that you can use to define, test, and deploy SQL transformations.
With dbt Projects on Snowflake, you can use familiar Snowflake features to create, edit, test, run, and manage your dbt Core projects, typically as follows:
Start with a valid dbt project: The project includes its standard files (dbt_project.yml, profiles.yml, /models/...) and is stored either in a workspace in Snowsight or in a Git repository that you've connected to Snowflake. Prepare a database, schema, and warehouse with a role that has the necessary privileges.
Install dependencies: Execute the dbt deps command within a Snowflake workspace, on a local machine, or in a Git orchestrator to populate the dbt_packages folder for your dbt project.
For more information, see Understand dependencies for dbt Projects on Snowflake.
Deploy the DBT PROJECT object: Create a schema-level DBT PROJECT object by copying your project files into a new version of that object. You can do this by using the CREATE OR REPLACE DBT PROJECT … FROM <source> command or the snow dbt deploy Snowflake CLI command.
For more information, see Deploy dbt project objects.
Execute the dbt project in Snowflake: Execute a dbt Core project within a dbt project object by using the EXECUTE DBT PROJECT command or the snow dbt execute Snowflake CLI command. Executing a dbt project invokes dbt Core commands to build or test models; this execution is what you schedule and orchestrate.
For more information, see EXECUTE DBT PROJECT.
Schedule with Snowflake tasks: Use Snowflake tasks to schedule and orchestrate dbt project runs.
For more information, see Schedule dbt project runs on Snowflake.
Set up CI/CD integrations: Use Snowflake CLI commands to integrate deployment and execution into your CI/CD workflows.
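dbt project objects support a number of Snowflake CLI commands that you can use to create and manage dbt projects from the command line. This is useful for integrating dbt projects into your data engineering workflows and CI/CD pipelines. For more information, see Snowflake CLI, Integrating CI/CD with Snowflake CLI, and snow dbt commands.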
Monitor the dbt project: Use Snowflake monitoring features to inspect, manage, and tune dbt project execution, whether you execute a dbt project object manually or use tasks to execute it on a schedule.
For more information, see Monitor dbt Projects on Snowflake.
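The deploy, execute, and schedule steps above can be sketched as a small Python helper that only assembles the SQL text for each step. This is a minimal sketch under stated assumptions, not Snowflake's reference syntax: the object names, the Git repository stage path, the quoted form of the FROM source, and the ARGS parameter used to pass the dbt command are illustrative assumptions, so check the command references before relying on them.

```python
def sql_literal(value: str) -> str:
    """Quote a value as a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def deploy_statement(project: str, source: str) -> str:
    """Build the CREATE OR REPLACE DBT PROJECT ... FROM <source> statement.

    Assumption: the source location is passed as a quoted string.
    """
    return f"CREATE OR REPLACE DBT PROJECT {project} FROM {sql_literal(source)}"


def execute_statement(project: str, dbt_args: str) -> str:
    """Build an EXECUTE DBT PROJECT statement.

    Assumption: the dbt command and its options go in an ARGS string.
    """
    return f"EXECUTE DBT PROJECT {project} ARGS = {sql_literal(dbt_args)}"


def task_statement(task: str, warehouse: str, schedule: str, body: str) -> str:
    """Wrap a statement in a scheduled Snowflake task."""
    return (
        f"CREATE OR REPLACE TASK {task}\n"
        f"  WAREHOUSE = {warehouse}\n"
        f"  SCHEDULE = {sql_literal(schedule)}\n"
        f"AS\n"
        f"  {body}"
    )


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    project = "my_db.my_schema.my_dbt_project"
    source = "@my_db.my_schema.my_git_repo/branches/main/my_project"
    run = execute_statement(project, "run --target prod")
    nightly = task_statement(
        "my_db.my_schema.run_dbt_nightly",
        "my_warehouse",
        "USING CRON 0 6 * * * UTC",
        run,
    )
    for statement in (deploy_statement(project, source), run, nightly):
        print(statement + ";\n")
```

Keeping each step as a separate function mirrors the workflow: the deploy statement is issued when project files change, while the execute statement is the unit that a task schedules.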
Key concepts
dbt project objects: A dbt project is a directory that contains a dbt_project.yml file and a set of files that define dbt assets, such as models and sources. A DBT PROJECT is a schema-level object that contains versioned source files for your dbt project in Snowflake. You can connect a dbt project object to a workspace, or you can create and manage the object independent of a workspace. You can CREATE, ALTER, and DROP dbt project objects like other schema-level objects in Snowflake.
A dbt project object is typically based on a dbt project directory that contains a dbt_project.yml file. This is the pattern that Snowflake uses when you deploy (create) a dbt project object from within a workspace.
For more information, see Understanding dbt project objects.
Schema customization: dbt uses the default macro generate_schema_name to decide where a model is built. You can customize how dbt builds your models, seeds, snapshots, and test tables.
For more information, see Understand schema generation and customization.
Workspaces: Workspaces in the Snowflake web interface are a Git-connected web IDE where you can visualize, test, run, and scaffold one or many dbt projects, link them to a Snowflake dbt project object to create/update it, and edit other Snowflake code in one place.
For more information, see Using workspaces for dbt Projects on Snowflake.
Versioning: Every dbt project object is versioned; versions live under snow://dbt/<db>.<schema>.<project>/versions/....
For more information, see Versioning for dbt project objects and files.
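To make the schema customization concept above more concrete, the following Python function mirrors the behavior of dbt's default generate_schema_name macro as described in the dbt documentation: without a custom schema, a model is built in the target schema; with one, the custom name is appended to the target schema. This is an illustrative re-expression in Python, not the macro itself (the real macro is written in Jinja), and the parameter names are assumptions chosen for readability.

```python
from typing import Optional


def generate_schema_name(target_schema: str,
                         custom_schema_name: Optional[str] = None) -> str:
    """Mirror dbt's default schema naming rule (illustrative only)."""
    if custom_schema_name is None:
        # No custom schema configured: build in the target schema.
        return target_schema
    # Custom schema configured: append it, trimmed, to the target schema.
    return f"{target_schema}_{custom_schema_name.strip()}"


if __name__ == "__main__":
    print(generate_schema_name("analytics"))               # analytics
    print(generate_schema_name("analytics", "marketing"))  # analytics_marketing
```

The concatenation is the detail that most often surprises new users: setting a custom schema of marketing does not build the model in a schema named marketing unless the macro is overridden in the project.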