教程 1:创建数据库、架构、表和仓库

简介

In this tutorial, you learn the fundamentals for managing Snowflake resource objects using the Snowflake Python APIs. To get started with the API, you create and manage a Snowflake database, schema, table, and virtual warehouse.

先决条件

Note

If you have already completed the steps in Common setup for Snowflake Python APIs tutorials, you can skip these prerequisites and proceed to the first step of this tutorial.

Before you start this tutorial, you must complete the common setup instructions, which includes the following steps:

  • 设置开发环境。
  • Install the Snowflake Python APIs package.
  • 配置 Snowflake 连接。
  • 导入 Python API 教程所需的所有模块。
  • Create an API Root object.

完成这些先决条件后,您就可以开始使用 API 了。

创建数据库、架构和表

You can use your root object to create a database, schema, and table in your Snowflake account.

  1. 要创建一个数据库,请在笔记本的下一个单元格中运行以下代码:

    database = root.databases.create(
      Database(
     name="PYTHON_API_DB"),
     mode=CreateMode.or_replace
      )

    This code, which is functionally equivalent to the SQL command CREATE OR REPLACE DATABASE PYTHON_API_DB, creates a database in your account named PYTHON_API_DB. This code follows a common pattern for managing objects in Snowflake:

    • root.databases.create() creates a database in Snowflake. It accepts two arguments: a Database object and a mode.

    • You pass a Database object by using Database(name="PYTHON_API_DB"), and set the name of the database by using the name argument. Recall that you imported Database on line 3 of the notebook.

    • You specify the creation mode by passing the mode argument. In this case, you set it to CreateMode.or_replace, but the following values are also valid:

      • CreateMode.if_not_exists: Functionally equivalent to CREATE IF NOT EXISTS in SQL.
      • CreateMode.error_if_exists: Raises an exception if the object already exists in Snowflake. This is the default value if a mode is not specified.
    • You manage the database programmatically by storing a reference to the database in an object you created named database.

    For more information, see Managing Snowflake databases, schemas, tables, and views with Python.

  2. Sign in to Snowsight.

  3. In the navigation menu, select Catalog » Database Explorer. If your code was successful, the PYTHON_API_DB database is listed.

    Tip

    If you use VS Code, install the Snowflake extension (https://marketplace.visualstudio.com/items?itemName=snowflake.snowflake-vsc) to explore all Snowflake objects within your editor.

  4. To create a schema in the PYTHON_API_DB database, in your next cell, run the following code:

    schema = database.schemas.create(
      Schema(
     name="PYTHON_API_SCHEMA"),
     mode=CreateMode.or_replace,
      )

    Note that you call .schemas.create() on the database object you created previously.

  5. To create a table in the schema you just created, in your next cell, run the following code:

    table = schema.tables.create(
      Table(
     name="PYTHON_API_TABLE",
     columns=[
       TableColumn(
         name="TEMPERATURE",
         datatype="int",
         nullable=False,
       ),
       TableColumn(
         name="LOCATION",
         datatype="string",
       ),
     ],
      ),
    mode=CreateMode.or_replace
    )

    This code creates a table in the PYTHON_API_SCHEMA schema with two columns and their data types specified: TEMPERATURE as int, and LOCATION as string.

    These last two code examples should look familiar because they follow the pattern in the first step where you created the PYTHON_API_DB database.

  6. To confirm that the objects were created, return to your Snowflake account in Snowsight.

检索对象数据

您可以检索有关 Snowflake 中对象的元数据。

  1. To retrieve details about the PYTHON_API_TABLE table you created previously, in your next notebook cell, run the following code:

    table_details = table.fetch()

    fetch() returns a TableModel object.

  2. You can then call .to_dict() on the resulting object to view its detailed information.

要打印表详细信息,请在下一个单元格中运行以下代码:

table_details.to_dict()

The notebook should display a dictionary that contains metadata about the PYTHON_API_TABLE table, similar to this:

{
 "name": "PYTHON_API_TABLE",
 "kind": "TABLE",
 "enable_schema_evolution": False,
 "change_tracking": False,
 "data_retention_time_in_days": 1,
 "max_data_extension_time_in_days": 14,
 "default_ddl_collation": "",
 "columns": [
     {"name": "TEMPERATURE", "datatype": "NUMBER(38,0)", "nullable": False},
     {"name": "LOCATION", "datatype": "VARCHAR(16777216)", "nullable": True},
 ],
 "created_on": datetime.datetime(
     2024, 5, 9, 8, 59, 15, 832000, tzinfo=datetime.timezone.utc
 ),
 "database_name": "PYTHON_API_DB",
 "schema_name": "PYTHON_API_SCHEMA",
 "rows": 0,
 "bytes": 0,
 "owner": "ACCOUNTADMIN",
 "automatic_clustering": False,
 "search_optimization": False,
 "owner_role_type": "ROLE",
}

As shown, this dictionary contains information about the PYTHON_API_TABLE table that you created previously, with detailed information about columns, owner, database, schema, and more.

Object metadata is useful when you are building business logic in your application. For example, you might build logic that runs depending on certain information about an object. You can use fetch() to retrieve object metadata in such scenarios.

以编程方式更改表

You can programmatically add a column to a table. The PYTHON_API_TABLE table currently has two columns: TEMPERATURE and LOCATION. In this scenario, you want to add a new column named ELEVATION of type int and set it as the table’s primary key.

  1. 请在下一个单元格中运行以下代码:

    table_details.columns.append(
     TableColumn(
       name="elevation",
       datatype="int",
       nullable=False,
       constraints=[PrimaryKey()],
     )
    )

    Note

    This code does not create the column. Instead, this column definition is appended to the array that represents the table’s columns in the TableModel. To view this array, review the value of columns as described in the instructions for viewing the table metadata.

  2. 要修改表并添加列,请在下一个单元格中运行以下代码:

    table.create_or_alter(table_details)

    In this line, you call create_or_alter() on the object representing PYTHON_API_TABLE, and pass the updated value of table_details. This line adds the ELEVATION column to PYTHON_API_TABLE.

  3. 要确认列已添加,请在下一个单元格中运行以下代码:

    table.fetch().to_dict()

输出应类似于以下内容:

{
 "name": "PYTHON_API_TABLE",
 "kind": "TABLE",
 "enable_schema_evolution": False,
 "change_tracking": False,
 "data_retention_time_in_days": 1,
 "max_data_extension_time_in_days": 14,
 "default_ddl_collation": "",
 "columns": [
     {"name": "TEMPERATURE", "datatype": "NUMBER(38,0)", "nullable": False},
     {"name": "LOCATION", "datatype": "VARCHAR(16777216)", "nullable": True},
     {"name": "ELEVATION", "datatype": "NUMBER(38,0)", "nullable": False},
 ],
 "created_on": datetime.datetime(
     2024, 5, 9, 8, 59, 15, 832000, tzinfo=datetime.timezone.utc
 ),
 "database_name": "PYTHON_API_DB",
 "schema_name": "PYTHON_API_SCHEMA",
 "rows": 0,
 "bytes": 0,
 "owner": "ACCOUNTADMIN",
 "automatic_clustering": False,
 "search_optimization": False,
 "owner_role_type": "ROLE",
 "constraints": [
     {"name": "ELEVATION", "column_names": ["ELEVATION"], "constraint_type": "PRIMARY KEY"}
 ]
}

Review the value of columns and the value of constraints, both of which now include the ELEVATION column.

  1. To confirm the existence of the new column, return to your Snowflake account in Snowsight and inspect the table.

创建和管理仓库

You can also manage virtual warehouses with the Snowflake Python APIs. For example, you might need to create another warehouse temporarily to run certain queries. In this scenario, you can use the API to create, suspend, or drop a warehouse.

  1. 要检索与会话相关的仓库集合,请在下一个单元格中运行以下代码:

    warehouses = root.warehouses

    You manage warehouses in your session using the resulting warehouses object.

  2. 要定义和创建一个新仓库,请在下一个单元格中运行以下代码:

    python_api_wh = Warehouse(
     name="PYTHON_API_WH",
     warehouse_size="SMALL",
     auto_suspend=500,
    )
    
    warehouse = warehouses.create(python_api_wh,mode=CreateMode.or_replace)

    In this code, you define a new warehouse by instantiating Warehouse and specifying the warehouse’s name, size, and auto-suspend policy. The auto-suspend timeout is in units of seconds. In this case, the warehouse will be suspended after 8.33 minutes of inactivity.

    You then create the warehouse by calling create() on your warehouse collection. You store the reference in the resulting warehouse object.

  3. Navigate to your Snowflake account in Snowsight and confirm that the warehouse was created.

  4. 要检索有关仓库的信息,请在下一个单元格中运行以下代码:

    warehouse_details = warehouse.fetch()
    warehouse_details.to_dict()

这段代码应该看起来很熟悉,因为它采用了在之前的步骤中提取表元数据时使用的模式。输出应类似于以下内容:

{
  'name': 'PYTHON_API_WH',
  'auto_suspend': 500,
  'auto_resume': 'true',
  'resource_monitor': 'null',
  'comment': '',
  'max_concurrency_level': 8,
  'statement_queued_timeout_in_seconds': 0,
  'statement_timeout_in_seconds': 172800,
  'tags': {},
  'warehouse_type': 'STANDARD',
  'warehouse_size': 'Small'
}
  1. 可选:如果在会话中有多个仓库,请使用 API 来对其循环访问或搜索特定仓库。

请在下一个单元格中运行以下代码:

warehouse_list = warehouses.iter(like="PYTHON_API_WH")
result = next(warehouse_list)
result.to_dict()

In this code, you call iter() on the warehouse collection and pass the like argument, which returns any warehouses whose names match the specified string. In this case, you pass the name of the warehouse you defined previously, but this argument is generally a case-insensitive string that functions as a filter, with support for SQL wildcard characters like % and _.

在运行单元格后,类似以下代码的输出表明您成功返回了匹配的仓库:

{
  'name': 'PYTHON_API_WH',
  'auto_suspend': 500,
  'auto_resume': 'true',
  'resource_monitor': 'null',
  'comment': '',
  'max_concurrency_level': 8,
  'statement_queued_timeout_in_seconds': 0,
  'statement_timeout_in_seconds': 172800,
  'tags': {},
  'warehouse_type': 'STANDARD',
  'warehouse_size': 'Small'
}
  1. To programmatically modify the warehouse by changing its size to LARGE, in your next cell, run the following code:

    warehouse = root.warehouses.create(Warehouse(
     name="PYTHON_API_WH",
     warehouse_size="LARGE",
     auto_suspend=500,
    ), mode=CreateMode.or_replace)
  2. To confirm that the warehouse size was updated to LARGE, do one of the following:

    • 请在下一个单元格中运行以下代码:

      warehouse.fetch().size
    • Navigate to your Snowflake account in Snowsight and confirm the change in warehouse size.

  3. 可选:如果您不想继续使用仓库,请将其删除。请在下一个单元格中运行以下代码:

    warehouse.drop()
  4. To confirm the warehouse deletion, return to your Snowflake account in Snowsight.

下一步是什么?

Congratulations! In this tutorial, you learned the fundamentals for managing Snowflake resource objects using the Snowflake Python APIs.

摘要

在此过程中,您完成了以下步骤:

  • Install the Snowflake Python APIs.
  • 设置与 Snowflake 的连接。
  • 创建数据库、架构和表。
  • 检索对象信息。
  • 以编程方式更改对象。
  • 创建、暂停和删除仓库。

下一个教程

You can now proceed to Tutorial 2: Create and manage tasks and task graphs (DAGs), which shows how to create and manage tasks and task graphs.

其他资源

要获取更多使用 API 管理 Snowflake 中其他类型对象的示例,请参阅以下开发者指南:

GuideDescription
Managing Snowflake users, roles, and grants with PythonUse the API to create and manage users, roles, and grants.
Managing data loading and unloading resources with PythonUse the API to create and manage data loading and unloading resources, including external volumes, pipes, and stages.
Managing Snowflake tasks and task graphs with PythonUse the API to create, execute, and manage tasks and task graphs.
Managing Snowpark Container Services (including service functions) with PythonUse the API to manage components of Snowpark Container Services, including compute pools, image repositories, services, and service functions.