Configure a catalog integration for AWS Glue Iceberg REST
Follow the steps in this topic to create a catalog integration for the AWS Glue Iceberg REST endpoint (https://docs.aws.amazon.com/glue/latest/dg/connect-glu-iceberg-rest.html) with Signature Version 4 (SigV4) (https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_aws-signing.html) authentication.
Note
To configure a catalog integration for connecting to AWS Glue Data Catalog through a private IP address instead of over the public internet, see Configure an Apache Iceberg™ REST catalog integration with outbound private connectivity.
Step 1: Configure access permissions for the AWS Glue Data Catalog
Create an IAM policy for Snowflake to access the AWS Glue Data Catalog. Attach the policy to an IAM role, which you specify when you create a catalog integration. For instructions, see Creating IAM policies (https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_create-console.html) and Modifying a role permissions policy (https://docs.aws.amazon.com/IAM/latest/UserGuide/roles-managingrole-editing-console.html#roles-modify_permissions-policy) in the AWS Identity and Access Management User Guide.
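Read-only example policy
At a minimum, Snowflake requires the following permissions on the AWS Glue Data Catalog to access information through the Glue Iceberg REST catalog:
- glue:GetCatalog
- glue:GetDatabase
- glue:GetDatabases
- glue:GetTable
- glue:GetTables
The following example policy (in JSON format) provides the required permissions to access all of the tables in a specified database.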
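As a minimal sketch of such a policy, the following Python snippet builds the policy document and prints it as JSON. The `Sid` value, the `<accountid>` and `<database-name>` placeholders, and the helper name `read_only_policy` are illustrative assumptions, not values from this topic; the wildcard Region in each ARN is also an assumption that you can narrow.

```python
import json

# Minimum Glue actions listed in this topic for read-only access.
READ_ACTIONS = [
    "glue:GetCatalog",
    "glue:GetDatabase",
    "glue:GetDatabases",
    "glue:GetTable",
    "glue:GetTables",
]


def read_only_policy(account_id: str, database_name: str) -> dict:
    """Build a read-only policy for all tables in one Glue database.

    Hypothetical helper: the Sid and the ARN layout are assumptions.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowGlueCatalogTableAccess",
                "Effect": "Allow",
                "Action": READ_ACTIONS,
                "Resource": [
                    f"arn:aws:glue:*:{account_id}:catalog",
                    f"arn:aws:glue:*:{account_id}:database/{database_name}",
                    f"arn:aws:glue:*:{account_id}:table/{database_name}/*",
                ],
            }
        ],
    }


if __name__ == "__main__":
    # Placeholders: replace with your AWS account ID and database name.
    print(json.dumps(read_only_policy("<accountid>", "<database-name>"), indent=2))
```

Paste the printed JSON into the IAM policy editor after replacing the placeholders.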
Note
- You can modify the Resource element of this policy to further restrict the allowed resources (for example, catalog, databases, or tables). For more information, see Resource types defined by AWS Glue (https://docs.aws.amazon.com/service-authorization/latest/reference/list_awsglue.html#awsglue-resources-for-iam-policies).
- If you use encryption for AWS Glue, you must modify the policy to add AWS Key Management Service (AWS KMS) permissions. For more information, see Setting up encryption in AWS Glue (https://docs.aws.amazon.com/glue/latest/dg/set-up-encryption.html).
Read-write example policy
The following example policy (in JSON format) provides the required permissions for read and write access to all of the tables in all databases. To configure write access for externally managed tables, use this policy as an example.
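A read-write policy along these lines might be sketched as follows. This is an assumption-heavy illustration: the specific Glue write actions, the S3 statements, the `Sid` values, and the `<bucket>` placeholder are not taken from this topic, so check them against the permissions your setup actually needs. The `database/*` resource matches the all-databases scope that the notes in this section describe.

```python
import json


def read_write_policy(account_id: str, bucket: str) -> dict:
    """Build a read-write policy for all tables in all Glue databases.

    Hypothetical helper. The write actions and S3 statements are
    assumptions chosen for illustration.
    """
    glue_actions = [
        "glue:GetCatalog",
        "glue:GetDatabase",
        "glue:GetDatabases",
        "glue:CreateDatabase",
        "glue:DeleteDatabase",
        "glue:GetTable",
        "glue:GetTables",
        "glue:CreateTable",
        "glue:UpdateTable",
        "glue:DeleteTable",
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowGlueCatalogTableAccess",
                "Effect": "Allow",
                "Action": glue_actions,
                "Resource": [
                    f"arn:aws:glue:*:{account_id}:catalog",
                    # All databases; required to create a database from Snowflake.
                    f"arn:aws:glue:*:{account_id}:database/*",
                    f"arn:aws:glue:*:{account_id}:table/*/*",
                ],
            },
            {
                # Assumption: object-level access to the table storage location.
                "Sid": "AllowTableStorageObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
            {
                # Assumption: listing the bucket that holds the table data.
                "Sid": "AllowTableStorageList",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(read_write_policy("<accountid>", "<bucket>"), indent=2))
```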
Note
- The policy must provide access to the storage location so that the AWS Glue catalog can write metadata to the table location.
- The "arn:aws:glue:*:<accountid>:database/*" line in the Resource element of this policy specifies all databases. This is required if you want to create a new database in Glue from Snowflake with the CREATE SCHEMA command. To limit access to a single database, you can specify the database by name. For more information about defining resources, see Resource types defined by AWS Glue (https://docs.aws.amazon.com/service-authorization/latest/reference/list_awsglue.html#awsglue-resources-for-iam-policies).
- If you use encryption for AWS Glue, you must modify the policy to add AWS Key Management Service (AWS KMS) permissions. For more information, see Setting up encryption in AWS Glue (https://docs.aws.amazon.com/glue/latest/dg/set-up-encryption.html).
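(Optional) Configure Lake Formation access control
If you use AWS Lake Formation for fine-grained access control, make sure that your Lake Formation configuration allows Snowflake to access the catalog objects and their underlying data.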
The IAM role that you created in the previous step (the role that you specify in Snowflake when you create a catalog integration) must have the lakeformation:GetDataAccess IAM permission. This permission grants read and write access to underlying data:
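A policy statement that grants this permission can be sketched as below. The `"Resource": "*"` scope and the helper name `lake_formation_policy` are assumptions made for illustration.

```python
import json


def lake_formation_policy() -> dict:
    """Build a policy that allows lakeformation:GetDataAccess.

    Hypothetical helper; the wildcard resource is an assumption.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "lakeformation:GetDataAccess",
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(lake_formation_policy(), indent=2))
```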
For more information, see Underlying data access control (https://docs.aws.amazon.com/lake-formation/latest/dg/access-control-underlying-data.html) in the Lake Formation documentation.
You must also grant data permissions to the IAM role. The method that you use to grant data permissions depends on your Lake Formation setup. For example, you might use the named resources method to grant permissions to AWS Glue objects, or you might use tag-based access control. For more information and instructions, see the AWS Lake Formation documentation (https://docs.aws.amazon.com/lake-formation/latest/dg/granting-catalog-permissions.html).
Step 2: Create a catalog integration in Snowflake
Create a catalog integration for the
AWS Glue Iceberg REST endpoint (https://docs.aws.amazon.com/glue/latest/dg/connect-glu-iceberg-rest.html)
using the CREATE CATALOG INTEGRATION (Apache Iceberg™ REST) command.
Specify the IAM role that you configured. For CATALOG_NAME, use your AWS account ID.
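The statement might look like the one rendered by the following sketch, which assembles the SQL text from a few inputs. The integration name, Region, account ID, and role ARN are hypothetical, and the parameter set shown (`CATALOG_API_TYPE`, `SIGV4_IAM_ROLE`, `SIGV4_SIGNING_REGION`) is an assumption to confirm against the CREATE CATALOG INTEGRATION (Apache Iceberg™ REST) reference.

```python
def create_catalog_integration_sql(
    name: str, region: str, account_id: str, role_arn: str
) -> str:
    """Render a CREATE CATALOG INTEGRATION statement for AWS Glue Iceberg REST.

    Hypothetical helper: it only formats text and does not connect to Snowflake.
    """
    lines = [
        f"CREATE CATALOG INTEGRATION {name}",
        "  CATALOG_SOURCE = ICEBERG_REST",
        "  TABLE_FORMAT = ICEBERG",
        "  REST_CONFIG = (",
        # Service endpoint for the AWS Glue Iceberg REST catalog.
        f"    CATALOG_URI = 'https://glue.{region}.amazonaws.com/iceberg'",
        "    CATALOG_API_TYPE = AWS_GLUE",
        # The catalog name is the AWS account ID.
        f"    CATALOG_NAME = '{account_id}'",
        "  )",
        "  REST_AUTHENTICATION = (",
        "    TYPE = SIGV4",
        f"    SIGV4_IAM_ROLE = '{role_arn}'",
        f"    SIGV4_SIGNING_REGION = '{region}'",
        "  )",
        "  ENABLED = TRUE;",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # All argument values below are placeholders.
    print(
        create_catalog_integration_sql(
            name="glue_rest_catalog_int",
            region="us-west-2",
            account_id="123456789012",
            role_arn="arn:aws:iam::123456789012:role/my-role",
        )
    )
```

Run the printed statement in a Snowflake worksheet or client of your choice.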
Where:
- CATALOG_URI is the service endpoint for the AWS Glue Iceberg REST catalog.
- CATALOG_NAME is the ID of your AWS account.
For more information, see CREATE CATALOG INTEGRATION (Apache Iceberg™ REST), which includes instructions for configuring a catalog integration for AWS Glue.
Step 3: Retrieve the AWS IAM user and external ID for your Snowflake account
To retrieve information about the AWS IAM user and the external ID for your Snowflake account, run the DESCRIBE CATALOG INTEGRATION command. You provide this information to AWS in the next step to establish a trust relationship.
Record the following values:

| Value | Description |
| --- | --- |
| GLUE_AWS_IAM_USER_ARN | The AWS IAM user created for your Snowflake account, for example, arn:aws:iam::123456789001:user/abc1-b-self1234. Snowflake provisions a single IAM user for your entire Snowflake account. All Glue catalog integrations in your account use that IAM user. |
| GLUE_AWS_EXTERNAL_ID | An external ID for establishing a trust relationship. |
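If you fetch the DESCRIBE CATALOG INTEGRATION output programmatically, picking out the two values could look like the sketch below. It assumes each row is available as a dictionary with `property` and `property_value` keys; that row shape, the helper name, and the sample external ID are assumptions, and the property names are the ones listed above.

```python
# Property names to record, as listed in this step.
WANTED = ("GLUE_AWS_IAM_USER_ARN", "GLUE_AWS_EXTERNAL_ID")


def extract_trust_values(rows: list[dict]) -> dict:
    """Return the IAM user ARN and external ID from DESCRIBE output rows.

    Assumption: each row has 'property' and 'property_value' keys.
    """
    found = {
        row["property"]: row["property_value"]
        for row in rows
        if row.get("property") in WANTED
    }
    missing = [name for name in WANTED if name not in found]
    if missing:
        raise KeyError(f"Missing properties in DESCRIBE output: {missing}")
    return found


if __name__ == "__main__":
    # Hypothetical sample rows; the external ID is a made-up placeholder.
    sample_rows = [
        {"property": "ENABLED", "property_value": "true"},
        {
            "property": "GLUE_AWS_IAM_USER_ARN",
            "property_value": "arn:aws:iam::123456789001:user/abc1-b-self1234",
        },
        {"property": "GLUE_AWS_EXTERNAL_ID", "property_value": "<external-id>"},
    ]
    print(extract_trust_values(sample_rows))
```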
Step 4: Grant the IAM user permissions to access the AWS Glue Data Catalog
Update the trust policy for the same IAM role that you specified with the ARN when you created the
catalog integration (GLUE_AWS_ROLE_ARN). Add the values that you recorded in the
previous step to the trust policy.
For instructions, see Modifying a trust policy (https://docs.aws.amazon.com/IAM/latest/UserGuide/roles-managingrole-editing-console.html#roles-managingrole_edit-trust-policy).
The following example policy shows where to specify the GLUE_AWS_IAM_USER_ARN and GLUE_AWS_EXTERNAL_ID values:
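A trust policy of this shape can be sketched as follows. The `<glue_iam_user_arn>` and `<glue_aws_external_id>` placeholders stand for the two recorded values; the helper name `trust_policy` is hypothetical, and the single-statement layout is an assumption about a typical role trust policy.

```python
import json


def trust_policy(glue_iam_user_arn: str, glue_aws_external_id: str) -> dict:
    """Build a role trust policy for the Snowflake IAM user.

    Hypothetical helper. The principal is the recorded IAM user ARN and
    the condition requires the recorded external ID.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": glue_iam_user_arn},
                "Action": "sts:AssumeRole",
                "Condition": {
                    "StringEquals": {"sts:ExternalId": glue_aws_external_id}
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholders: substitute the values that you recorded.
    print(
        json.dumps(
            trust_policy("<glue_iam_user_arn>", "<glue_aws_external_id>"),
            indent=2,
        )
    )
```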
Where:
- glue_iam_user_arn is the GLUE_AWS_IAM_USER_ARN value that you recorded.
- glue_aws_external_id is the GLUE_AWS_EXTERNAL_ID value that you recorded.
Note
- For security reasons, if you create a new catalog integration (or recreate an existing catalog integration by using the CREATE OR REPLACE CATALOG INTEGRATION syntax), the new catalog integration has a different external ID and can’t resolve the trust relationship unless you modify the trust policy with the new external ID.
- To verify that your permissions are configured correctly, create an Iceberg table that uses this catalog integration. Snowflake doesn’t verify that your permissions are set correctly until you create an Iceberg table that references this catalog integration.
Next steps
After you configure a catalog integration for AWS Glue Iceberg REST, you can create a catalog-linked database. Specify the name of your catalog integration as the catalog when you create your catalog-linked database.
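A catalog-linked database brings external data into Snowflake by automatically discovering the namespaces and tables in the remote Iceberg REST catalog and keeping them in sync.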