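Use a catalog-linked database for Apache Iceberg™ tables¶

With a catalog-linked database, you can access multiple remote Iceberg tables from Snowflake without creating individual externally managed tables.

A catalog-linked database is a Snowflake database that is connected to an external Iceberg REST catalog. Snowflake automatically syncs with the external catalog to detect namespaces and Iceberg tables, and registers the remote tables to the catalog-linked database. Catalog-linked databases also support creating and dropping schemas or Iceberg tables.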

Billing for catalog-linked databases¶

Snowflake bills your account for the following usage:

Automatic table discovery, create schema, drop schema, and drop table. Snowflake bills your account for this usage under the CREDITS_USED_CLOUD_SERVICES usage type. Usage for cloud services is charged only if the daily consumption of cloud services exceeds 10% of the daily usage of virtual warehouses. For more information, see Understanding billing for cloud services usage.
Create table. Snowflake bills your account for this usage under the CREDITS_USED_COMPUTE usage type, through automated refresh. The cost for this usage is described in Table 5 of the Snowflake service consumption table on the Snowflake website. Refer to the Snowflake-managed compute column for the Automated Refresh and Data Registration row.

Snowflake won't bill you for any cloud services that you use during table creation.
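The 10% threshold for cloud services can be illustrated with a small calculation. The following Python sketch is illustrative only: the function name is hypothetical, and it assumes that only the portion of daily cloud services usage above the 10% allowance is billed. Confirm the exact adjustment against the cloud services billing documentation before you rely on it.

```python
def billable_cloud_services(cloud_services_credits, warehouse_credits):
    """Estimate the cloud services credits billed for one day.

    Assumption: the daily allowance equals 10% of that day's virtual
    warehouse credits, and only usage above the allowance is billed.
    """
    if cloud_services_credits < 0 or warehouse_credits < 0:
        raise ValueError("credit values must be non-negative")
    allowance = 0.10 * warehouse_credits
    # Usage at or below the allowance results in no cloud services charge.
    return max(0.0, cloud_services_credits - allowance)


# Hypothetical daily figures: 100 warehouse credits and 15 cloud services credits.
print(billable_cloud_services(15.0, 100.0))
```

With these sample figures, the allowance is 10 credits, so only the remaining cloud services usage would count toward the bill. A day with 8 cloud services credits against the same warehouse usage would fall under the allowance entirely.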

Note

To view the credit usage for your catalog-linked databases, use the CATALOG_LINKED_DATABASE_USAGE_HISTORY view.

Workflow for using a catalog-linked database¶

The following steps describe how to create a catalog-linked database, check the sync status between Snowflake and your catalog, and create or query tables in the database.

Configure access to your external catalog and table storage¶

Before you create a catalog-linked database, you need to configure access to your external catalog and table storage. To configure this access, you configure a catalog integration with vended credentials. With this option, your remote Iceberg catalog must support credential vending.

For instructions, see Use catalog-vended credentials for Apache Iceberg™ tables.

Note

If your remote Iceberg catalog doesn't support credential vending, you must configure both an external volume and a catalog integration to access your external catalog and table storage. First, configure an external volume for your cloud storage provider. Then, configure an Apache Iceberg™ REST catalog integration for your remote Iceberg catalog.

Create a catalog-linked database¶

Use the CREATE DATABASE (catalog-linked) command to create a catalog-linked database.

The following example creates a catalog-linked database that uses vended credentials. The sync interval is 30 seconds, which is the default. The sync interval tells Snowflake how often to poll your remote catalog.

CREATE DATABASE my_linked_db
  LINKED_CATALOG = (
    CATALOG = 'my_catalog_int'
  );

Note

To create a catalog-linked database that uses an external volume, see CREATE DATABASE (catalog-linked), including the example.

Your catalog-linked database includes a link icon.

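Check the catalog sync status¶

To check whether Snowflake has successfully linked your remote catalog to your database, use the SYSTEM$CATALOG_LINK_STATUS function.

The function also provides information that helps you identify tables in the remote catalog that failed to sync.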

SELECT SYSTEM$CATALOG_LINK_STATUS('my_linked_db');

Identify tables that were created but couldn't be initialized¶

To identify tables in the remote catalog that synced successfully but fail to refresh automatically, run the SHOW ICEBERG TABLES command, and then check the auto_refresh_status column in the output. These tables have an executionState of ICEBERG_TABLE_NOT_INITIALIZED in the output.

For example, Snowflake might successfully discover a table in your remote catalog and create it in your catalog-linked database, but the table has a corrupted data file in your remote catalog. As a result, Snowflake can't automatically refresh the table until you resolve the error.

Automated refresh is turned off for these kinds of tables, so querying the table in Snowflake returns an error that says the table was never initialized. To query the table, you must fix the error, and then turn on automated refresh for the table.
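If you collect the SHOW ICEBERG TABLES output in a client program, the check described above can be automated. The following Python sketch rests on several assumptions: each output row has already been fetched as a dictionary, the auto_refresh_status column holds a JSON string that contains an executionState field, and the sample rows (including the RUNNING state) are made up for illustration. The function name is hypothetical.

```python
import json

NOT_INITIALIZED = "ICEBERG_TABLE_NOT_INITIALIZED"


def uninitialized_tables(rows):
    """Return the names of tables that were created but never initialized.

    Assumption: each row is a dict with a "name" key and an
    "auto_refresh_status" key that holds a JSON string (or None).
    """
    names = []
    for row in rows:
        raw = row.get("auto_refresh_status")
        if not raw:
            continue  # No status reported for this table.
        try:
            status = json.loads(raw)
        except (TypeError, ValueError):
            continue  # Skip values that are not valid JSON.
        if isinstance(status, dict) and status.get("executionState") == NOT_INITIALIZED:
            names.append(row["name"])
    return names


# Made-up rows that stand in for the SHOW ICEBERG TABLES output.
sample_rows = [
    {"name": "ORDERS", "auto_refresh_status": '{"executionState": "RUNNING"}'},
    {"name": "EVENTS", "auto_refresh_status": '{"executionState": "ICEBERG_TABLE_NOT_INITIALIZED"}'},
    {"name": "NO_STATUS", "auto_refresh_status": None},
]
print(uninitialized_tables(sample_rows))
```

For the sample rows, only EVENTS is reported. Each table that the function returns still needs its underlying error fixed before you turn on automated refresh for it.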

Query a table in your catalog-linked database¶

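After you create a catalog-linked database, Snowflake starts the table discovery process and automatically polls the linked catalog for changes at the interval set by the SYNC_INTERVAL_SECONDS parameter (30 seconds by default).

In the database, the allowed namespaces from the remote catalog appear as schemas, and the Iceberg tables appear under their respective schemas.

You can query the remote tables by using a SELECT statement.

Note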

For the requirements for identifying objects in a catalog-linked database, see Requirements for identifier resolution in a catalog-linked database.

For more information about object identifiers, see Identifier requirements.

For example:

USE DATABASE my_linked_db;

SELECT * FROM my_namespace.my_iceberg_table
  LIMIT 20;

Write to the remote catalog¶

You can use Snowflake to create namespaces and Iceberg tables in your linked catalog. For more information, see the following topics:

Requirements for identifier resolution in a catalog-linked database¶

The requirement for resolving an identifier depends on the following:

The value that you specified for the CATALOG_CASE_SENSITIVITY parameter when you created your catalog-linked database.
Whether your external Iceberg catalog uses case-sensitive or case-insensitive identifiers.

Note

These requirements apply to identifying existing schemas, tables, and table columns. They also include some special cases for creating or altering an object.
When you create a new schema, table, or column in a case-sensitive catalog such as AWS Glue or Unity Catalog, you must use lowercase letters and surround the schema, table, and column names in double quotes. This is also required for other Iceberg REST catalogs that only support lowercase identifiers.

The following scenarios show the requirement for each combination of the CATALOG_CASE_SENSITIVITY value and the identifier behavior of the external Iceberg catalog:

CATALOG_CASE_SENSITIVITY value: CASE_SENSITIVE. External Iceberg catalog uses: case-sensitive identifiers.

Snowflake matches identifiers exactly as they appear, including case. Snowflake automatically converts unquoted identifiers to uppercase, but quoted identifiers must exactly match the case in your external catalog.

The following example shows a valid query for creating a table:

CREATE TABLE "Table1" (id INT, name STRING);

Snowflake creates the table in the external catalog as `Table1`, which preserves the capitalization you used. You can also create a lowercase `table1` table, if needed.

The following example shows a valid query for selecting the `Table1` table:

SELECT * FROM "Table1";

In the previous example, the double quotes are required to match the capitalization exactly.

The following example shows an invalid query, unless a `TABLE1` table exists:

SELECT * FROM table1;

In the previous example, the query is invalid if `TABLE1` doesn't exist because the identifier isn't surrounded with double quotes. As a result, Snowflake converts the identifier to uppercase.

The following example shows an invalid query for the case when an all-uppercase `TABLE1` doesn't exist:

SELECT * FROM TABLE1;

CATALOG_CASE_SENSITIVITY value: CASE_SENSITIVE. External Iceberg catalog uses: case-insensitive identifiers.

If the external Iceberg catalog is actually case-insensitive and normalizes to lowercase, you must surround identifiers in double quotes. The following example shows valid queries:

SELECT * FROM "s1";
SELECT * FROM "lowercasetablename";

CATALOG_CASE_SENSITIVITY value: CASE_INSENSITIVE. External Iceberg catalog uses: case-insensitive identifiers.

If your case-insensitive catalog has a lowercase `table1` table, all of the following queries are valid:

SELECT * FROM table1;
SELECT * FROM TABLE1;
SELECT * FROM Table1;
SELECT * FROM "table1";

For any of the following commands, you must surround the schema, table, and column names in double quotes:

CREATE ICEBERG TABLE
CREATE SCHEMA
ALTER ICEBERG TABLE ADD COLUMN
ALTER ICEBERG TABLE RENAME COLUMN

CATALOG_CASE_SENSITIVITY value: CASE_INSENSITIVE. External Iceberg catalog uses: case-sensitive identifiers.

If the external Iceberg catalog is actually case-sensitive, Snowflake treats unquoted identifiers as case-insensitive and automatically converts unquoted identifiers to uppercase. When you create or query objects, Snowflake matches identifiers regardless of case, as long as they are unquoted.

Using this pattern is discouraged because Snowflake can't resolve two different identifiers that differ only in casing. This pattern works only when no two identifiers differ in casing alone.

Consider the case where the remote catalog has a `Table1` table. All of the following queries are valid for querying that table:

SELECT * FROM table1;
SELECT * FROM TABLE1;
SELECT * FROM Table1;
SELECT * FROM "Table1";

Quoted identifiers preserve case and match exactly. However, in CASE_INSENSITIVE mode, unquoted and quoted forms are both supported.
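The resolution rules above can be approximated with a small model. The following Python sketch is a simplified illustration, not Snowflake's actual resolver: the function names are hypothetical, quoted identifiers are assumed to match exactly in both modes, escaped double quotes inside identifiers are ignored, and an unquoted identifier that matches more than one remote name in CASE_INSENSITIVE mode is treated as unresolvable.

```python
def parse_identifier(sql_identifier):
    """Split an identifier into its text and a flag that says whether it was quoted."""
    if len(sql_identifier) >= 2 and sql_identifier[0] == '"' and sql_identifier[-1] == '"':
        return sql_identifier[1:-1], True
    return sql_identifier, False


def resolve(sql_identifier, catalog_names, case_sensitivity):
    """Return the remote object name that the identifier resolves to, or None."""
    text, quoted = parse_identifier(sql_identifier)
    if quoted:
        # Assumption: quoted identifiers preserve case and must match exactly.
        return text if text in catalog_names else None
    if case_sensitivity == "CASE_SENSITIVE":
        # Unquoted identifiers are converted to uppercase before matching.
        upper = text.upper()
        return upper if upper in catalog_names else None
    if case_sensitivity == "CASE_INSENSITIVE":
        matches = [name for name in catalog_names if name.upper() == text.upper()]
        # Two names that differ only in casing can't be told apart.
        return matches[0] if len(matches) == 1 else None
    raise ValueError("unknown CATALOG_CASE_SENSITIVITY value: " + case_sensitivity)


# Hypothetical remote catalog that contains a single mixed-case table.
remote = {"Table1"}
print(resolve('"Table1"', remote, "CASE_SENSITIVE"))
print(resolve("table1", remote, "CASE_SENSITIVE"))
print(resolve("table1", remote, "CASE_INSENSITIVE"))
```

In this model, the unquoted `table1` fails in CASE_SENSITIVE mode because it becomes `TABLE1`, which doesn't exist, while the same identifier resolves to `Table1` in CASE_INSENSITIVE mode. Adding a second remote name such as `table1` makes the unquoted lookup return no result, which mirrors why mixing CASE_INSENSITIVE with a case-sensitive catalog is discouraged.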

Considerations for using a catalog-linked database for Iceberg tables¶

Consider the following items when you use a catalog-linked database: