CREATE CATALOG INTEGRATION (AWS Glue)

Creates a new catalog integration in the account or replaces an existing catalog integration for Apache Iceberg™ tables that use AWS Glue as the catalog.

Important

To integrate with AWS Glue, we recommend that you instead create a catalog integration for the AWS Glue Iceberg REST endpoint (https://docs.aws.amazon.com/glue/latest/dg/connect-glu-iceberg-rest.html), which supports additional Iceberg table features such as catalog-vended credentials.

For instructions, see CREATE CATALOG INTEGRATION (Apache Iceberg™ REST).

Note

When you create a catalog integration for AWS Glue, you must complete additional steps to establish a trust relationship between Snowflake and the Glue Data Catalog. For information, see Configure a catalog integration for AWS Glue.

See also:

ALTER CATALOG INTEGRATION , DROP CATALOG INTEGRATION , SHOW CATALOG INTEGRATIONS, DESCRIBE CATALOG INTEGRATION

语法

CREATE [ OR REPLACE ] CATALOG INTEGRATION [IF NOT EXISTS]
  <name>
  CATALOG_SOURCE = GLUE
  TABLE_FORMAT = ICEBERG
  GLUE_AWS_ROLE_ARN = '<arn-for-AWS-role-to-assume>'
  GLUE_CATALOG_ID = '<glue-catalog-id>'
  [ GLUE_REGION = '<AWS-region-of-the-glue-catalog>' ]
  [ CATALOG_NAMESPACE = '<catalog-namespace>' ]
  ENABLED = { TRUE | FALSE }
  [ REFRESH_INTERVAL_SECONDS = <value> ]
  [ COMMENT = '<string_literal>' ]

必填参数

name

用于指定目录集成的标识符(名称)的字符串;在账户中必须唯一。

In addition, the identifier must start with an alphabetic character and cannot contain spaces or special characters unless the entire identifier string is enclosed in double quotes (for example, "My object"). Identifiers enclosed in double quotes are also case sensitive.

For more information, see Identifier requirements.

CATALOG_SOURCE = GLUE

指定该集成用于 AWS Glue。

TABLE_FORMAT = ICEBERG

指定 Glue Iceberg 表。

GLUE_AWS_ROLE_ARN = 'arn-for-AWS-role-to-assume'

指定要承担的 AWS Identity and Access Management (IAM) 角色的 Amazon Resource Name (ARN)。

GLUE_CATALOG_ID = 'glue-catalog-id'

指定 AWS 账户的 ID。

ENABLED = { TRUE | FALSE }

指定目录集成是否可用于 Iceberg 表。

  • TRUE lets users create new Iceberg tables that reference this integration. Existing Iceberg tables that reference this integration function normally.
  • FALSE prevents users from creating new Iceberg tables that reference this integration. Existing Iceberg tables that reference this integration cannot access the catalog in the table definition.

The value is case-insensitive.

The default is TRUE.

可选参数

[ GLUE_REGION = 'AWS-region-of-the-glue-catalog' ]

指定 AWS Glue Data Catalog 的 AWS 区域。如果 Snowflake 账户未托管在 AWS 上,则必须为此参数指定一个值。否则,默认区域是账户的 Snowflake 部署区域。

CATALOG_NAMESPACE = 'catalog-namespace'

Specifies your AWS Glue Data Catalog namespace (for example, my_glue_database). This is the default namespace for all Iceberg tables that you associate with this catalog integration.

  • 如果已指定,您可以在创建表时在表级别指定命名空间,从而替换此值。
  • 如果未指定,则在创建表时必须在表级别指定命名空间。
REFRESH_INTERVAL_SECONDS = value

Specifies the number of seconds that Snowflake waits between attempts to poll the external Iceberg catalog for metadata updates for automated refresh.

For Delta-based tables, specifies the number of seconds that Snowflake waits between attempts to poll your external cloud storage for new metadata.

Values: 30 to 86400, inclusive

Default: 30 seconds

COMMENT = 'string_literal'

字符串(字面量),用于指定集成注释。

默认:无值

访问控制要求

A role used to execute this operation must have the following privileges at a minimum:

权限对象备注
CREATE INTEGRATION账户Only the ACCOUNTADMIN role has this privilege by default. The privilege can be granted to additional roles as needed.

For instructions on creating a custom role with a specified set of privileges, see Creating custom roles.

For general information about roles and privilege grants for performing SQL actions on securable objects, see Overview of Access Control.

使用说明

  • 您无法修改现有目录集成;请改用 CREATE OR REPLACE CATALOG INTEGRATION 语句。

  • You can’t drop or replace a catalog integration if one or more Apache Iceberg™ tables are associated with the catalog integration.

    To view the tables that depend on a catalog integration, you can use the SHOW ICEBERG TABLES command and a query using the pipe operator (->>) that filters on the catalog_name column.

    Note

    The column identifier (catalog_name) is case-sensitive. Specify the column identifier exactly as it appears in the SHOW ICEBERG TABLES output.

    For example:

    SHOW ICEBERG TABLES
      ->> SELECT *
            FROM $1
            WHERE "catalog_name" = 'my_catalog_integration_1';
  • 关于元数据:

    Attention

    Customers should ensure that no personal data (other than for a User object), sensitive data, export-controlled data, or other regulated data is entered as metadata when using the Snowflake service. For more information, see Metadata fields in Snowflake.

  • The OR REPLACE and IF NOT EXISTS clauses are mutually exclusive. They can’t both be used in the same statement.
  • CREATE OR REPLACE <object> statements are atomic. That is, when an object is replaced, the old object is deleted and the new object is created in a single transaction.

示例

The following example creates a catalog integration that uses an AWS Glue catalog source. When you create a catalog integration for Glue, you must complete additional steps to establish a trust relationship between Snowflake and the Glue Data Catalog. For information, see Configure a catalog integration for AWS Glue.

CREATE CATALOG INTEGRATION glueCatalogInt
  CATALOG_SOURCE = GLUE
  CATALOG_NAMESPACE = 'myNamespace'
  TABLE_FORMAT = ICEBERG
  GLUE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/myGlueRole'
  GLUE_CATALOG_ID = '123456789012'
  GLUE_REGION = 'us-east-2'
  ENABLED = TRUE;