CREATE ICEBERG TABLE（Iceberg REST 目录）¶

在当前/指定架构中为 Iceberg REST 目录创建或替换 Apache Iceberg™ 表。

在以下场景中使用此命令：

您想要使用符合开源 Apache Iceberg REST OpenAPI 规范 (https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml) 的远程 Iceberg 目录。
您想要查询 Snowflake Open Catalog 或 Apache Polaris™ 中的表。有关更多信息，请参阅使用 Snowflake 查询 Snowflake Open Catalog 中的表。
You want to create an externally managed table in a catalog-linked database. See CREATE ICEBERG TABLE（目录链接的数据库）.

备注

在创建表之前，必须创建存储 Iceberg 元数据和数据文件的外部卷。有关说明，请参阅配置外部卷。

您还需要为表配置目录集成。有关更多信息，请参阅为 Apache Iceberg™ REST 目录配置目录集成或为 Snowflake Open Catalog 配置目录集成。

另请参阅：: ALTER ICEBERG TABLE、DROP ICEBERG TABLE、SHOW ICEBERG TABLES、DESCRIBE ICEBERG TABLE、UNDROP ICEBERG TABLE

语法¶

CREATE [ OR REPLACE ] ICEBERG TABLE [ IF NOT EXISTS ] <table_name>
  [ EXTERNAL_VOLUME = '<external_volume_name>' ]
  [ CATALOG = '<catalog_integration_name>' ]
  CATALOG_TABLE_NAME = '<rest_catalog_table_name>'
  [ CATALOG_NAMESPACE = '<catalog_namespace>' ]
  [ PARTITION BY ( partitionExpression [ , partitionExpression , ... ] ) ]
  [ TARGET_FILE_SIZE = '{ AUTO | 16MB | 32MB | 64MB | 128MB }' ]
  [ REPLACE_INVALID_CHARACTERS = { TRUE | FALSE } ]
  [ AUTO_REFRESH = { TRUE | FALSE } ]
  [ COMMENT = '<string_literal>' ]
  [ [ WITH ] TAG ( <tag_name> = '<tag_value>' [ , <tag_name> = '<tag_value>' , ... ] ) ]
  [ WITH CONTACT ( <purpose> = <contact_name> [ , <purpose> = <contact_name> ... ] ) ]
  [ STORAGE_SERIALIZATION_POLICY = { COMPATIBLE | OPTIMIZED } ]

Copy

其中：

partitionExpression ::=
  <col_name> -- identity transform
  | BUCKET ( <num_buckets> , <col_name> )
  | TRUNCATE ( <width> , <col_name> )
  | YEAR ( <col_name> )
  | MONTH ( <col_name> )
  | DAY ( <col_name> )
  | HOUR ( <col_name> )

Copy

变体语法¶

CREATE ICEBERG TABLE（目录链接的数据库）¶

CREATE [ OR REPLACE ] ICEBERG TABLE [ IF NOT EXISTS ] <table_name>
  [
    --Column definition
    <col_name> <col_type>
      [ [ WITH ] MASKING POLICY <policy_name> [ USING ( <col_name> , <cond_col1> , ... ) ] ]

    -- Additional column definitions
    [ , <col_name> <col_type> [ ... ] ]
  ]
  [ PARTITION BY ( partitionExpression [ , partitionExpression , ... ] ) ]
  [ TARGET_FILE_SIZE = '{ AUTO | 16MB | 32MB | 64MB | 128MB }' ]
  [ MAX_DATA_EXTENSION_TIME_IN_DAYS = <integer> ]
  [ AUTO_REFRESH = { TRUE | FALSE } ]
  [ REPLACE_INVALID_CHARACTERS = { TRUE | FALSE } ]
  [ COPY GRANTS ]
  [ COMMENT = '<string_literal>' ]
  [ [ WITH ] TAG ( <tag_name> = '<tag_value>' [ , <tag_name> = '<tag_value>' , ... ] ) ]
  [ BASE_LOCATION = '<path_to_directory_for_table_files>' ]
  [ STORAGE_SERIALIZATION_POLICY = { COMPATIBLE | OPTIMIZED } ]

Copy

其中：

partitionExpression ::=
  <col_name> -- identity transform
  | BUCKET ( <num_buckets> , <col_name> )
  | TRUNCATE ( <width> , <col_name> )
  | YEAR ( <col_name> )
  | MONTH ( <col_name> )
  | DAY ( <col_name> )
  | HOUR ( <col_name> )

Copy

CREATE ICEBERG TABLE（目录链接的数据库）... AS SELECT¶

CREATE [ OR REPLACE ] ICEBERG TABLE <table_name> [ ( <col_name> [ <col_type> ] , <col_name> [ <col_type> ] , ... ) ]
  [ ... ]
  AS SELECT <query>

Copy

您可以将掩码策略应用于 CTAS 语句中的列。先指定列数据类型，然后指定掩码策略。例如：

CREATE [ OR REPLACE ] ICEBERG TABLE <table_name> ( <col1> <data_type> [ WITH ] MASKING POLICY <policy_name> [ , ... ] )
  [ ... ]
  AS SELECT <query>

Copy

必填参数¶

table_name

指定 Snowflake 中的表的标识符（名称）；对于在其中创建表的架构而言，必须是唯一的。

此外，标识符必须以字母字符开头，且不能包含空格或特殊字符，除非整个标识符字符串放在双引号内（例如，"My object"）。放在双引号内的标识符也区分大小写。

有关更多信息，请参阅标识符要求。

备注

要检索远程目录中的表或命名空间列表，可以使用以下函数：

CATALOG_TABLE_NAME = 'rest_catalog_table_name'

指定外部目录可识别的表名称。创建表后，此参数无法更改。

备注

请勿使用表名称 (mynamespace.mytable) 指定命名空间。若要为此表指定命名空间，并替换对目录集成设置的默认命名空间，请使用 CATALOG_NAMESPACE 参数。

col_name

用于在目录链接的数据库中创建表（预览版）。

指定列标识符（名称）。表标识符的所有要求也适用于列标识符。

有关更多信息，请参阅标识符要求和保留和受限关键字。

备注

除了标准的保留关键字之外，以下关键字不能用作列标识符，因为它们是为 ANSI 标准上下文函数保留的：

CURRENT_DATE
CURRENT_ROLE
CURRENT_TIME
CURRENT_TIMESTAMP
CURRENT_USER

有关保留关键字的列表，请参阅保留和受限关键字。

col_type

用于在目录链接的数据库中创建表（预览版）。

指定列的数据类型。

有关可为表列指定的数据类型的信息，请参阅 Apache Iceberg™ 表的数据类型。

可选参数¶

PARTITION BY = ( partitionExpression [ , partitionExpression , ... ] )

Specifies one or more partition expressions.

MASKING POLICY = policy_name

用于在目录链接的数据库中创建表（预览版）。

指定要在列上设置的掩码策略。掩码策略必须属于标准的 Snowflake 数据库（而不是目录链接的数据库）。

EXTERNAL_VOLUME = 'external_volume_name'

指定外部卷的标识符（名称），Iceberg 表会以 Parquet 格式将自身的元数据文件和数据存储在里面。Iceberg 元数据和清单文件存储表架构、分区、快照和其他元数据。

如果未指定此参数，则 Iceberg 表将默认使用架构、数据库或账户的外部卷。架构优先于数据库，数据库优先于账户。

CATALOG = 'catalog_integration_name'

指定此表的目录集成的标识符（名称）。

如果未指定此参数，则 Iceberg 表将默认使用架构、数据库或账户的目录集成。架构优先于数据库，数据库优先于账户。

CATALOG_NAMESPACE = 'catalog_namespace'

（可选）指定 REST 目录源的命名空间（例如，my_database）。通过在目录集成中指定命名空间，然后在表级别指定，您可以使用单个 REST 目录集成在不同数据库中创建 Iceberg 表。如果您没有在表中指定命名空间，则该表使用与目录集成关联的默认目录命名空间。
如果在目录集成中未指定默认命名空间，则必须为 REST 目录源指定命名空间，以设置表的目录命名空间。

备注

要检索远程目录中的表或命名空间列表，可以使用以下函数：

TARGET_FILE_SIZE = '{ AUTO | 16MB | 32MB | 64MB | 128MB }'

Specifies a target Parquet file size for the table.

'{ 16MB | 32MB | 64MB | 128MB }' specifies a fixed target file size for the table.
'AUTO' works differently, depending on the table type:
- Snowflake-managed tables: AUTO specifies that Snowflake should choose the file size for the table based on table characteristics such as size, DML patterns, ingestion workload, and clustering configuration. Snowflake automatically adjusts the file size, starting at 16 MB, for better read and write performance in Snowflake. Use this option to optimize table performance in Snowflake.
- Externally managed tables: AUTO specifies that Snowflake should aggressively scale to the largest file size (128 MB).

For more information, see Set a target file size.

Default: AUTO

MAX_DATA_EXTENSION_TIME_IN_DAYS = integer

对象参数，指定 Snowflake 可以延长表的数据保留期以防止表上的流过时的最大天数。

有关此参数的详细说明，请参阅 MAX_DATA_EXTENSION_TIME_IN_DAYS。

REPLACE_INVALID_CHARACTERS = { TRUE | FALSE }

指定在查询结果中是否使用 Unicode 替换字符 (�) 替换无效的 UTF-8 字符。您只能为使用外部 Tables 目录的表设置此参数。

TRUE 会将无效的 UTF-8 字符替换为 Unicode 替换字符。
FALSE 保留无效的 UTF-8 字符不变。如果在 Parquet 数据文件中遇到无效的 UTF-8 字符，Snowflake 会返回用户错误消息。

如果未指定，则 Iceberg 表将默认使用架构、数据库或账户的参数值。架构优先于数据库，数据库优先于账户。

默认：FALSE

AUTO_REFRESH = { TRUE | FALSE }

指定 Snowflake 是否应自动轮询与该表关联的外部 Iceberg 目录以获取元数据更新。

如果没有为目录集成上的 REFRESH_INTERVAL_SECONDS 参数指定值，则 Snowflake 将使用 30 秒作为默认刷新间隔。

有关更多信息，请参阅自动刷新。

默认：FALSE

备注

不支持将 AUTO_REFRESH 与 INFER_SCHEMA 一起使用。

COPY GRANTS

指定在使用以下 CREATE TABLE 变体创建新表时保留原始表的访问权限：

CREATE OR REPLACE TABLE

该参数将除 OWNERSHIP 之外的所有权限从现有表复制到新表。新表不会继承为架构中的对象类型定义的任何未来授权。默认情况下，执行 CREATE TABLE 语句的角色拥有新表。

如果该参数未包含在 CREATE ICEBERG TABLE 语句中，则新表不会继承在原始表上授予的任何显式访问权限，但会继承为架构中的对象类型定义的任何未来授权。

注意：

借助数据共享：
- 如果现有表已共享到另一个账户，则替换表也会共享。
- 如果现有表已作为数据使用者与您的账户共享，并且进一步授予了对账户中其他角色的访问权限（在父数据库上使用 GRANT IMPORTED PRIVILEGES），则还会授予对替换表的访问权限。
替换表的 SHOW GRANTS 输出会将复制权限的获得者列为执行 CREATE ICEBERG TABLE 语句的角色，并附带执行语句时的当前时间戳。
复制授权的操作在 CREATE ICEBERG TABLE 命令中会以原子方式发生（即在同一事务中）。

COMMENT = 'string_literal'

指定表的注释。

默认：无值

TAG ( tag_name = 'tag_value' [ , tag_name = 'tag_value' , ... ] )

指定标签名称和标签字符串值。

标签值始终为字符串，标签值的最大字符数为 256。

有关在语句中指定标签的信息，请参阅 Tag quotas。

WITH CONTACT ( purpose = contact [ , purpose = contact ...] ): 将新对象与一个或多个联系人关联。

BASE_LOCATION = 'path_to_directory_for_table_files'

The path to a directory, which Snowflake uses to construct write paths for the table's data and metadata files.

If you use an EXTERNAL_VOLUME, this path must be included with the storage paths that are specified for the external volume and you have the option to specify a relative path. If you specify a relative path, it is relative to the STORAGE_BASE_URL for the external volume. If not specified, Snowflake constructs a write path by using attributes such as the value of the BASE_LOCATION_PREFIX parameter and the table name.

If you're using vended credentials, you must also specify an absolute path.

备注

This directory can't be changed after you create a table.

Partition expression parameters (`partitionExpression`)¶

Snowflake supports all partition transforms in version 2 of the Apache Iceberg specification. For more information, see Partition Transforms (https://iceberg.apache.org/spec/#partition-transforms).

For more information about partitioning Iceberg tables, see Iceberg partitioning.

col_name

Specifies the identifier (name) for the source column to partition.

When used alone, without a transform such as YEAR, specifies an identity transform on the source column. For more information, see identity (https://iceberg.apache.org/spec/#partition-transforms).

BUCKET

Specifies a bucket transform. For more information, see Bucket Transform Details (https://iceberg.apache.org/spec/#bucket-transform-details).

num_buckets is the number of buckets to group the data into.

TRUNCATE

Specifies a truncate transform, which partitions the data based on the truncated values of the specified source column. For more information, see Truncate Transform Details (https://iceberg.apache.org/spec/#truncate-transform-details).

YEAR

Specifies a year transform, which extracts the year from a date or timestamp source-column value. For more information, see Partition Transforms (https://iceberg.apache.org/spec/#partition-transforms).

MONTH

Specifies a month transform. For more information, see Partition Transforms (https://iceberg.apache.org/spec/#partition-transforms).

DAY

Specifies a day transform, which extracts the day from a date or timestamp source-column value. For more information, see Partition Transforms (https://iceberg.apache.org/spec/#partition-transforms).

HOUR

Specifies an hour transform, which extracts the hour from a timestamp source-column value. For more information, see Partition Transforms (https://iceberg.apache.org/spec/#partition-transforms).

访问控制要求¶

用于执行此操作的角色必须至少具有以下权限：

权限	对象	备注
CREATE ICEBERG TABLE	架构
CREATE EXTERNAL VOLUME	账户	需要创建新的外部卷。
USAGE	外部卷	需要引用现有的外部卷。
CREATE INTEGRATION	账户	需要创建新的目录集成。
USAGE	目录集成	需要引用现有目录集成。

The USAGE privilege on the parent database and schema are required to perform operations on any object in a schema. Note that a role granted any privilege on a schema allows that role to resolve the schema. For example, a role granted CREATE privilege on a schema can create objects on that schema without also having USAGE granted on that schema.

有关创建具有指定权限集的自定义角色的说明，请参阅创建自定义角色。

有关对安全对象执行 SQL 操作的相应角色和权限授予的一般信息，请参阅访问控制概述。

使用说明¶

示例¶

创建使用远程 Iceberg REST 目录的 Iceberg 表¶

CREATE OR REPLACE ICEBERG TABLE my_iceberg_table
  EXTERNAL_VOLUME = 'my_external_volume'
  CATALOG = 'my_rest_catalog_integration'
  CATALOG_TABLE_NAME = 'my_remote_table'
  AUTO_REFRESH = TRUE;

Copy

创建 Iceberg 表来查询 Snowflake Open Catalog 中的表¶

此示例创建了一个 Iceberg 表，可用于使用 Snowflake 查询 Snowflake Open Catalog 中的表。

CREATE ICEBERG TABLE open_catalog_iceberg_table
  EXTERNAL_VOLUME = 'my_external_volume'
  CATALOG = 'open_catalog_int'
  CATALOG_TABLE_NAME = 'my_open_catalog_table'
  AUTO_REFRESH = TRUE;

Copy

在目录链接的数据库中创建 Iceberg 表¶

The following example creates a writable Iceberg table in a catalog-linked database with column definitions:

USE DATABASE my_catalog_linked_db;

USE SCHEMA 'my_namespace';

CREATE OR REPLACE ICEBERG TABLE my_iceberg_table (
  first_name string,
  last_name string,
  amount int,
  create_date date
);

Copy

Create a partitioned table in a catalog-linked database¶

The following example creates an externally managed Iceberg table by using the value of a timestamp column named start_date to partition the table by day:

USE DATABASE my_catalog_linked_db;

USE SCHEMA 'my_namespace';

CREATE OR REPLACE ICEBERG TABLE iceberg_partitioned_date_time (start_date timestamp)
  PARTITION BY (DAY(start_date));

Copy

You can insert data into the table by using supported table-loading features. For example, use an INSERT INTO statement to insert the following data into the empty iceberg_partitioned_date_time table created previously:

INSERT INTO iceberg_partitioned_date_time (start_date)
  VALUES
    (to_timestamp_ntz('2023-01-02 00:00:00')),
    (to_timestamp_ntz('2023-02-03 00:00:00')),
    (to_timestamp_ntz('2023-01-02 01:00:00')),
    (to_timestamp_ntz('2023-02-03 02:00:00'));

Copy

For more information, see Iceberg partitioning.

CREATE ICEBERG TABLE（Iceberg REST 目录）¶

语法¶

变体语法¶

CREATE ICEBERG TABLE（目录链接的数据库）¶

CREATE ICEBERG TABLE（目录链接的数据库）... AS SELECT¶

必填参数¶

可选参数¶

Partition expression parameters (partitionExpression)¶

访问控制要求¶

使用说明¶

示例¶

创建使用远程 Iceberg REST 目录的 Iceberg 表¶

创建 Iceberg 表来查询 Snowflake Open Catalog 中的表¶

在目录链接的数据库中创建 Iceberg 表¶

Create a partitioned table in a catalog-linked database¶

Partition expression parameters (`partitionExpression`)¶