将 Git 与 Snowflake 结合使用的示例

本主题中的示例介绍在开发 Snowflake 应用程序时如何使用远程 Git 存储库中的文件,以及如何在 Git 存储库克隆中执行 SQL 脚本。

请务必查看以下内容,其中描述了与 Git 存储库克隆进行交互的其他方式。

将 Git 存储库文件用作存储过程处理程序

After you’ve set up integration between Snowflake and your remote Git repository, you can use files from the repository as handler code in stored procedures and UDFs. Note that, as with staged handlers, you must qualify the handler function name with the name of its containing class or module.

此示例描述如何在存储过程中使用存储库中的 Python 处理程序代码。

此示例所需的代码

此示例中的处理程序依赖于使用类似于以下内容的 SQL 代码创建的数据库:

CREATE DATABASE example_db;
USE DATABASE example_db;
CREATE SCHEMA example_schema;
USE SCHEMA example_schema;

CREATE OR REPLACE TABLE employees(id NUMBER, name VARCHAR, role VARCHAR);
INSERT INTO employees (id, name, role) VALUES (1, 'Alice', 'op'), (2, 'Bob', 'dev'), (3, 'Cindy', 'dev');

The example uses the following Python handler code contained in filter.py:

from snowflake.snowpark.functions import col

def filter_by_role(session, table_name, role):
  df = session.table(table_name)
  return df.filter(col("role") == role)

提交文件并刷新 Git 存储库克隆

  1. 在 Git 客户端中,将代码添加到远程存储库。

以下示例中的代码使用 Git 命令行工具将处理程序文件添加并提交到本地存储库,然后将其推送到 Snowflake 中的 Git 存储库克隆引用的远程存储库:

$ git add python-handlers/filter.py
$ git commit -m "Adding code to filter by role"
$ git push
  1. 在 Snowflake 中,刷新 Git 存储库克隆。

    Assuming you’ve set up integration between Snowflake and your remote Git repository, resulting in a Git repository clone in Snowflake, you can refresh the Git repository clone by fetching from the remote repository.

使用 Snowflake 从远程存储库刷新类似于使用其他 Git 客户端工具,在开始工作之前从远程存储库中提取更新,以确保您拥有最新的更改。

Code in the following example executes the ALTER GIT REPOSITORY command to retrieve the latest changes from the remote repository. The code generates a full clone that includes branches, tags, and commits.

ALTER GIT REPOSITORY snowflake_extensions FETCH;

创建并执行使用 Git 存储库克隆中的文件的过程

  1. 在 Snowflake 中编写过程。

    When you write a procedure, you can reference its handler code at the code file’s location in the Git repository clone in Snowflake. For example, to refer to a file python-handlers/filter.py in the main branch of a remote repository synchronized to a Git repository clone called snowflake_extensions, you would use syntax similar to the following:

    @snowflake_extensions/branches/main/python-handlers/filter.py

    Code in the following example creates a procedure called filter_by_role, specifying handler code stored in the Git repository clone:

    CREATE OR REPLACE PROCEDURE filter_by_role(tableName VARCHAR, role VARCHAR)
      RETURNS TABLE(id NUMBER, name VARCHAR, role VARCHAR)
      LANGUAGE PYTHON
      RUNTIME_VERSION = '3.12'
      PACKAGES = ('snowflake-snowpark-python')
      IMPORTS = ('@example_db.example_schema.snowflake_extensions/branches/main/python-handlers/filter.py')
      HANDLER = 'filter.filter_by_role';
  2. 执行该过程。

以下代码会执行该过程。

CALL filter_by_role('employees', 'dev');

以下是该过程的输出示例。

---------------------
| ID | NAME  | ROLE |
---------------------
| 2  | Bob   | dev  |
---------------------
| 3  | Cindy | dev  |
---------------------

使用 Git 存储库克隆文件配置新账户

本示例说明如何在 Snowflake 中执行一个包含在 Git 存储库克隆中的 SQL 脚本。示例中的脚本会创建用户和角色。

This example uses the EXECUTE IMMEDIATE FROM command to execute the SQL statements contained in a file in the Git repository clone.

借助 EXECUTE IMMEDIATE FROM,您可以执行(从任何 Snowflake 会话)您在远程 Git 存储库中管理的脚本。例如,您可能有一个脚本,用于设置组织中的每个新 Snowflake 账户。该脚本可能包含用于创建用户、角色、对象以及授予对账户和对象的权限的语句。

  1. Create the file setup.sql with the following contents:

    CREATE ROLE analyst;
    
    CREATE USER gladys;
    
    GRANT ROLE analyst TO USER gladys;
    
    SHOW GRANTS TO USER gladys;
  2. 将 SQL 文件提交到远程 Git 存储库中。

    Use the git command-line tool to commit the file to your remote Git repository:

    git add scripts/setup.sql
    git commit -m "Adding code to set up new accounts"
    git push

    For detailed instructions, see 提交文件并刷新 Git 存储库克隆.

  3. 刷新 Git 存储库克隆。

    Refresh the Git repository clone configuration_repo:

    ALTER GIT REPOSITORY configuration_repo FETCH;

    For detailed instructions, see 提交文件并刷新 Git 存储库克隆.

  4. 在 Snowflake 中,执行 Git 存储库克隆中的文件:

    Note

    The user executing the following statement must use a role that has the required privileges to execute all statements in the file. For more information, see Access control requirements.

    EXECUTE IMMEDIATE FROM @configuration_repo/branches/main/scripts/setup.sql;

    The EXECUTE IMMEDIATE FROM commands returns the results of the last SQL statement in the file:

    +-------------------------------+---------+------------+--------------+--------------+
    | created_on                    | role    | granted_to | grantee_name | granted_by   |
    |-------------------------------+---------+------------+--------------+--------------|
    | 2023-07-24 22:07:04.354 -0700 | ANALYST | USER       | GLADYS       | ACCOUNTADMIN |
    +-------------------------------+---------+------------+--------------+--------------+