Azure Private Endpoints for Internal Stages

This topic provides concepts and detailed instructions for connecting to Snowflake internal stages through Microsoft Azure private endpoints.

Overview

Azure private endpoints (https://docs.microsoft.com/en-us/azure/private-link/private-endpoint-overview) and Azure Private Link (https://docs.microsoft.com/en-us/azure/private-link/private-link-overview) can be combined to provide secure connectivity to Snowflake internal stages. This setup ensures that data loading and data unloading operations to Snowflake internal stages use the Azure internal network and do not take place over the public internet.

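Before Microsoft supported internal stage access through Azure private endpoints, you had to create a proxy farm in the Azure VNet to enable secure access to Snowflake internal stages. With the added support for Azure private endpoints for Snowflake internal stages, users and client applications can now access Snowflake internal stages over the private Azure network. The following diagram summarizes this new support: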

Connect to internal stage using Azure Private Link

Note the following regarding the numbers in the BEFORE diagram:

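  • Users have two options for connecting to a Snowflake internal stage:

    • Option A allows an on-premises connection directly to the internal stage, as shown by number 1.
    • Option B allows a connection to the internal stage through a proxy farm, as shown by numbers 2 and 3.
  • If using the proxy farm, users can also connect to Snowflake directly, as shown by number 4.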

Note the following regarding the numbers in the AFTER diagram:

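  • For clarity, the diagram shows a single Azure private endpoint from one Azure VNet pointing to a single Snowflake internal stage (numbers 6 and 7).

    Note that you can configure multiple Azure private endpoints, each in a different VNet, that point to the same Snowflake internal stage.

  • With this feature update, you no longer need to connect to Snowflake or a Snowflake internal stage through a proxy farm.
  • An on-premises user can connect to Snowflake directly, as shown by number 5.
  • To connect to a Snowflake internal stage, an on-premises user connects to a private endpoint (number 6) and then uses Azure Private Link to connect to the Snowflake internal stage (number 7).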

In Azure, each Snowflake account has a dedicated storage account to use as an internal stage. The storage account URI differs depending on whether the connection to the storage account uses private connectivity (that is, Azure Private Link). The private connectivity URI includes a privatelink segment.

Public storage account URI:

<storage_account_name>.blob.core.windows.net

Private connectivity storage account URI:

<storage_account_name>.privatelink.blob.core.windows.net

After you configure a private endpoint connection for your account’s internal stage, Microsoft Azure automatically creates a CNAME record in the public DNS service that points the storage account host to its Azure Private Link counterpart. This counterpart is <storage_account_name>.privatelink.blob.core.windows.net.
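
The relationship between the two hostname forms can be sketched in Python. This is a minimal illustration that assumes the hostname follows the <storage_account_name>.blob.core.windows.net pattern shown above; the function name to_privatelink_host and the sample account name mystorageacct are hypothetical and are not part of any Snowflake or Azure API.

```python
PUBLIC_SUFFIX = ".blob.core.windows.net"
PRIVATE_SUFFIX = ".privatelink.blob.core.windows.net"


def to_privatelink_host(host: str) -> str:
    """Return the private connectivity hostname for a storage account host.

    Hypothetical helper: it only rewrites the string and performs no DNS lookup.
    """
    name = host.strip().rstrip(".").lower()
    # Already in the private connectivity form: return it unchanged.
    if name.endswith(PRIVATE_SUFFIX):
        return name
    if not name.endswith(PUBLIC_SUFFIX):
        raise ValueError(f"not a blob storage hostname: {host!r}")
    account = name[: -len(PUBLIC_SUFFIX)]
    # The part before the suffix must be a single label (the account name).
    if not account or "." in account:
        raise ValueError(f"unexpected storage account name in: {host!r}")
    return account + PRIVATE_SUFFIX


# Example with a made-up storage account name.
print(to_privatelink_host("mystorageacct.blob.core.windows.net"))
```

The helper is only a naming aid, for example when preparing the DNS records described later in this topic.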

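Benefits

Implementing private connectivity to access Snowflake internal stages provides the following advantages:

  • Internal stage data does not traverse the public internet.
  • Client and SaaS applications, such as Microsoft PowerBI, that run outside the Azure VNet can connect to Snowflake securely.
  • Administrators are not required to modify firewall settings to access internal stage data.
  • Administrators can implement consistent security and monitoring for how users connect to storage accounts.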

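Limitations

Microsoft Azure defines how an Azure private endpoint can interact with Snowflake:

Configuring Azure private endpoints to access Snowflake internal stages

To configure Azure private endpoints to access Snowflake internal stages, you need support from the following three roles in your organization:

  1. The Snowflake account administrator (that is, a user with the Snowflake ACCOUNTADMIN system role).
  2. The Microsoft Azure administrator.
  3. The network administrator.

Depending on the organization, you might need to coordinate the configuration effort with more than one person or team to implement the following configuration steps.

Complete the following steps to configure and implement secure access to Snowflake internal stages through Azure private endpoints: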

  1. Verify that your Azure subscription is registered with the Azure Storage resource manager. This step allows you to connect to the internal stage from a private endpoint.

  2. As a Snowflake account administrator, run the following commands in your Snowflake account and record the ResourceID of the internal stage storage account defined by the privatelink_internal_stage key. For more information, see ENABLE_INTERNAL_STAGES_PRIVATELINK and SYSTEM$GET_PRIVATELINK_CONFIG.

    USE ROLE ACCOUNTADMIN;
    ALTER ACCOUNT SET ENABLE_INTERNAL_STAGES_PRIVATELINK = true;
    SELECT KEY, VALUE FROM TABLE(flatten(input=>parse_json(system$get_privatelink_config())));
  3. As the Azure administrator, create an Azure private endpoint through the Azure portal.

    View the private endpoint properties and record the resource ID value. You will provide this value as the privateEndpointResourceID function argument in the next step.

    Verify that the Target sub-resource value is set to blob.

    For more information, see the Microsoft Azure Private Link documentation (https://docs.microsoft.com/en-us/azure/private-link/).

    Important

    Before you proceed with the next step to authorize the private endpoint, you should be aware of the Microsoft Azure DNS behavior when a private endpoint is authorized on a storage location for the very first time.

    When the first private endpoint is connected and authorized, Azure automatically creates a CNAME record in its public DNS for storage-account-name.privatelink.blob.core.windows.net.

    Under normal circumstances, this DNS update should not affect existing public connectivity to the storage account. However, if your environment already has private DNS zones configured for .privatelink.blob.core.windows.net, this DNS update can lead to unintended behavior. Specifically, existing storage clients attempting to access the public endpoint storage-account-name.blob.core.windows.net may fail DNS resolution or be unable to reach the storage account using public IP.

    To avoid this issue, Microsoft recommends enabling the Fallback to Internet option in the private DNS zone configuration before authorizing the first private endpoint. This guidance also appears as a cautionary note in the Microsoft Azure DNS zone configuration documentation (https://learn.microsoft.com/en-us/azure/private-link/private-endpoint-dns#azure-services-dns-zone-configuration).

  4. As the Snowflake administrator, call the SYSTEM$AUTHORIZE_STAGE_PRIVATELINK_ACCESS function using the privateEndpointResourceID value as the function argument. This step authorizes access to the Snowflake internal stage through the private endpoint.

    USE ROLE ACCOUNTADMIN;
    SELECT SYSTEM$AUTHORIZE_STAGE_PRIVATELINK_ACCESS('<privateEndpointResourceID>');

    To verify which private endpoints are authorized for the internal stage of your account and to check the approval status of each endpoint connection, call the SYSTEM$GET_STAGE_PRIVATELINK_AUTHORIZED_ENDPOINTS function.

    If necessary, complete these steps to revoke access to the internal stage.

  5. Involve your network administrator to update the DNS settings in a private DNS zone. The settings must resolve the privatelink blob URL <storage_account_name>.privatelink.blob.core.windows.net to the private IP address(es) of the Azure private endpoint that connects to your storage account internal stage.

    For more information, see Azure Private Endpoint DNS configuration (https://docs.microsoft.com/en-us/azure/private-link/private-endpoint-dns).

    Tip

    • Use a separate Snowflake account for testing, and configure a private DNS zone in a test VNet to test the feature so that the testing is isolated and does not impact your other workloads.
    • If using a separate Snowflake account is not possible, use a test user to access Snowflake from a test VNet where the DNS changes are made.
    • To test from on-premises applications, use DNS forwarding to forward requests to the Azure private DNS in the VNet where the DNS settings are made. Run the following command from the client machine to verify that the IP address returned is the private IP address for the storage account:
      dig <storage_account_name>.blob.core.windows.net
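
The DNS check in the tip above can also be scripted. The following Python sketch resolves a hostname and reports whether every returned IPv4 address falls in an RFC 1918 private range. It assumes that your private endpoint uses an RFC 1918 address, which is typical for Azure VNets but is not guaranteed; the function names resolve_ipv4 and all_private are hypothetical.

```python
import ipaddress
import socket

# RFC 1918 private IPv4 ranges. Assumption: the private endpoint IP is in one of these.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def resolve_ipv4(hostname):
    """Return the sorted, de-duplicated IPv4 addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, 443, family=socket.AF_INET, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def all_private(addresses):
    """Return True only if the list is non-empty and every address is in an RFC 1918 range."""
    parsed = [ipaddress.ip_address(a) for a in addresses]
    return bool(parsed) and all(
        any(ip in net for net in RFC1918_NETWORKS) for ip in parsed
    )


# Usage from the client machine (replace the placeholder with your storage account name):
#   ips = resolve_ipv4("<storage_account_name>.blob.core.windows.net")
#   print(ips, all_private(ips))
```

A result of False from all_private suggests that the client is still resolving the storage account to a public address, so the private DNS zone or DNS forwarding might need another look.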

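After configuring Azure private endpoints to access the internal stage through Azure Private Link, you can optionally block requests from public IP addresses to the internal stage. Once public access is blocked, all traffic must go through the Azure private endpoint.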

Controlling public access to an Azure internal stage differs from controlling public access to the Snowflake service. You use the SYSTEM$BLOCK_INTERNAL_STAGES_PUBLIC_ACCESS function, not a network policy, to block requests to the internal stage. Unlike network policies, this function can’t block some public IP addresses while allowing others. Calling the SYSTEM$BLOCK_INTERNAL_STAGES_PUBLIC_ACCESS function blocks all public IP addresses.

Important

Confirm that traffic using private connectivity is successfully reaching the internal stage before blocking public access. Blocking public access without configuring private connectivity can cause unintended disruptions, including interference with managed services like Azure Data Factory.

The SYSTEM$BLOCK_INTERNAL_STAGES_PUBLIC_ACCESS function enforces its restrictions by altering the Networking settings of the Azure storage account where the internal stage is located. These Azure settings are commonly referred to as the “storage account firewall settings”. Calling this Snowflake system function performs the following actions in Azure:

  • Sets the Public network access field to Enabled from selected virtual networks and IP addresses.
  • Adds Snowflake VNet subnet IDs to the Virtual Networks section.
  • Clears all IP addresses from the Firewall section.

To block all traffic from public IP addresses to the internal stage, call the following function:

SELECT SYSTEM$BLOCK_INTERNAL_STAGES_PUBLIC_ACCESS();

The function might take a few minutes to complete.

Blocking public access with IP allowlist exceptions

The SYSTEM$BLOCK_INTERNAL_STAGES_PUBLIC_ACCESS_WITH_EXCEPTION function extends the set of functions for blocking public access to internal stages. While the SYSTEM$BLOCK_INTERNAL_STAGES_PUBLIC_ACCESS function blocks all public IP addresses, SYSTEM$BLOCK_INTERNAL_STAGES_PUBLIC_ACCESS_WITH_EXCEPTION lets you block public access while maintaining an allowlist of IP addresses or CIDR blocks that are permitted to reach an internal stage location on Microsoft Azure.

Note

This feature is not supported on Amazon Web Services or Google Cloud.

To block public access to internal stages on Microsoft Azure while allowing specific IP addresses or CIDR blocks, take the following steps:

  1. Define IP allowlist exceptions
  2. Verify function status
  3. Test stage access with a pre-signed URL

Define IP allowlist exceptions

To create or modify an allowlist that defines which IP addresses can access an internal stage location on Microsoft Azure, call the SYSTEM$BLOCK_INTERNAL_STAGES_PUBLIC_ACCESS_WITH_EXCEPTION function and provide a comma-separated list of IP addresses or CIDR ranges as function arguments. For example:

USE ROLE ACCOUNTADMIN;

SELECT SYSTEM$BLOCK_INTERNAL_STAGES_PUBLIC_ACCESS_WITH_EXCEPTION('1.2.3.4/24, 100.0.0.1, 101.0.0.0/31');

Note

You can also call this function to replace an existing allowlist with a different one.
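
Before submitting an allowlist, it can help to check that each entry parses as an IP address or CIDR range and to see which range it actually denotes. The following Python sketch uses the standard ipaddress module for that. The function name normalize_allowlist is hypothetical, and whether Snowflake or Azure accepts, rejects, or normalizes an entry with host bits set (such as 1.2.3.4/24) is not stated in this topic, so treat the normalization as an assumption made for readability.

```python
import ipaddress


def normalize_allowlist(entries):
    """Validate allowlist entries and return them in canonical text form.

    CIDR entries are reduced to their network address (1.2.3.4/24 -> 1.2.3.0/24).
    Raises ValueError for anything that is not an IP address or CIDR range.
    """
    normalized = []
    for raw in entries:
        text = raw.strip()
        if not text:
            continue  # ignore empty items, for example from a trailing comma
        if "/" in text:
            # strict=False tolerates host bits and masks them off.
            normalized.append(str(ipaddress.ip_network(text, strict=False)))
        else:
            normalized.append(str(ipaddress.ip_address(text)))
    return normalized


# The entries from the example above, split on commas.
print(normalize_allowlist("1.2.3.4/24, 100.0.0.1, 101.0.0.0/31".split(",")))
```

Running the check first surfaces typos locally instead of in the storage account firewall settings.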

Verify function status

Check that the feature is active and view the IP allowlist by calling the SYSTEM$INTERNAL_STAGES_PUBLIC_ACCESS_STATUS function:

SELECT SYSTEM$INTERNAL_STAGES_PUBLIC_ACCESS_STATUS();

Test stage access with a pre-signed URL

To confirm the allowlist is working correctly:

  1. Ensure the ENABLE_INTERNAL_STAGES_PRIVATELINK parameter is set to TRUE.
  2. Create an internal stage and upload a sample file for testing.
  3. Generate a pre-signed URL for that file and test access from different IP addresses. Only requests originating from allowlisted IPs should be allowed.
    SELECT GET_PRESIGNED_URL(@my_stage, 'data/sample.csv');
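
When testing the pre-signed URL from several client machines, it is useful to know in advance which source IP addresses are expected to succeed. The following Python sketch predicts that from the allowlist alone. It does not contact Snowflake or Azure, the function name is_allowlisted is hypothetical, and the client IP addresses are made up for illustration.

```python
import ipaddress


def is_allowlisted(client_ip, allowlist):
    """Return True if client_ip falls inside any allowlist entry (IP address or CIDR range)."""
    ip = ipaddress.ip_address(client_ip)
    for entry in allowlist:
        # A bare IP address is treated as a single-address network (/32 for IPv4).
        network = ipaddress.ip_network(entry.strip(), strict=False)
        if ip in network:
            return True
    return False


allowlist = ["100.0.0.1", "1.2.3.0/24", "101.0.0.0/31"]
for candidate in ["1.2.3.77", "101.0.0.1", "101.0.0.2", "100.0.0.2"]:
    expected = "allowed" if is_allowlisted(candidate, allowlist) else "blocked"
    print(candidate, "should be", expected)
```

If a request from an address that this check marks as blocked still succeeds, recheck the allowlist and the function status before relying on the configuration.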

Examples

Block public access while allowing specific IP addresses and CIDR ranges:

USE ROLE ACCOUNTADMIN;

SELECT SYSTEM$BLOCK_INTERNAL_STAGES_PUBLIC_ACCESS_WITH_EXCEPTION('100.0.0.1', '1.2.3.0/24', '101.0.0.0/31');
Public Access to internal stages is blocked. Private link is required to connect to internal stages of this account. Exceptions: 100.0.0.1, 1.2.3.0/24, 101.0.0.0/31

Replace the existing allowlist with a new set of exceptions:

SELECT SYSTEM$BLOCK_INTERNAL_STAGES_PUBLIC_ACCESS_WITH_EXCEPTION('200.0.0.1', '10.0.0.0/16');
Public Access to internal stages is blocked. Private link is required to connect to internal stages of this account. Exceptions: 200.0.0.1, 10.0.0.0/16

Ensuring public access is blocked

To determine whether public IP addresses are able to access an internal stage, call the SYSTEM$INTERNAL_STAGES_PUBLIC_ACCESS_STATUS function.

If the Azure settings are currently blocking all public traffic, the function returns Public Access to internal stages is blocked. This verifies that the settings have not been changed since the SYSTEM$BLOCK_INTERNAL_STAGES_PUBLIC_ACCESS function was called.

If at least some public IP addresses can access the internal stage, the function returns Public Access to internal stages is unblocked.
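
For monitoring scripts, the returned message can be turned into a structured value. The following Python sketch recognizes the two messages quoted above and, as an assumption, also reads an Exceptions: list in the form shown in the example output of SYSTEM$BLOCK_INTERNAL_STAGES_PUBLIC_ACCESS_WITH_EXCEPTION. The exact text that the status function returns when an allowlist is active is not confirmed in this topic, and the function name parse_access_status is hypothetical.

```python
BLOCKED_PREFIX = "Public Access to internal stages is blocked"
UNBLOCKED_PREFIX = "Public Access to internal stages is unblocked"
EXCEPTIONS_MARKER = "Exceptions:"


def parse_access_status(message):
    """Parse a status message into {'blocked': bool, 'exceptions': [str, ...]}."""
    text = message.strip()
    if text.startswith(UNBLOCKED_PREFIX):
        return {"blocked": False, "exceptions": []}
    if text.startswith(BLOCKED_PREFIX):
        exceptions = []
        if EXCEPTIONS_MARKER in text:
            # Assumed format: a comma-separated list after the marker.
            tail = text.split(EXCEPTIONS_MARKER, 1)[1]
            exceptions = [part.strip() for part in tail.split(",") if part.strip()]
        return {"blocked": True, "exceptions": exceptions}
    raise ValueError(f"unrecognized status message: {message!r}")


print(parse_access_status("Public Access to internal stages is blocked."))
```

Raising an error on any unrecognized message is deliberate: a changed or unexpected message should not be silently interpreted as "blocked".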

Unblocking public access

To allow public access to an internal stage that was previously blocked, call the SYSTEM$UNBLOCK_INTERNAL_STAGES_PUBLIC_ACCESS function.

Calling this function alters the Networking settings of the Azure storage account where the internal stage is located. It sets the Azure Public network access field to Enabled from all networks.

Revoking Azure private endpoints to access Snowflake internal stages

To revoke access to Snowflake internal stages through Microsoft Azure private endpoints, complete the following steps:

  1. As a Snowflake administrator, confirm that the ENABLE_INTERNAL_STAGES_PRIVATELINK parameter is set to TRUE. For example:

    USE ROLE ACCOUNTADMIN;
    SHOW PARAMETERS LIKE 'enable_internal_stages_privatelink' IN ACCOUNT;
  2. As a Snowflake administrator, call the SYSTEM$REVOKE_STAGE_PRIVATELINK_ACCESS function to revoke access to the private endpoint, and use the same privateEndpointResourceID value that was used to originally authorize access to the private endpoint.

    USE ROLE ACCOUNTADMIN;
    SELECT SYSTEM$REVOKE_STAGE_PRIVATELINK_ACCESS('<privateEndpointResourceID>');
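  3. As the Azure administrator, delete the private endpoint through the Azure portal.

  4. As the network administrator, remove the DNS and alias records that were used to resolve the storage account URLs.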

At this point, the access to the private endpoint is revoked. The query result from calling the SYSTEM$GET_PRIVATELINK_CONFIG function shouldn’t return the privatelink_internal_stage key and its value.
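
The check described above can be automated once the function output is available as a JSON string. The following Python sketch looks up the privatelink_internal_stage key and returns None when it is absent. It assumes the output is a JSON object, as implied by the parse_json call used earlier in this topic; the function name internal_stage_resource_id and the sample values are hypothetical.

```python
import json


def internal_stage_resource_id(config_json):
    """Return the value of privatelink_internal_stage, or None if the key is absent."""
    config = json.loads(config_json)
    if not isinstance(config, dict):
        raise ValueError("expected a JSON object")
    return config.get("privatelink_internal_stage")


# Made-up outputs for illustration; real output contains your account's values.
before_revoke = '{"privatelink_internal_stage": "<resource_id>"}'
after_revoke = '{"some_other_key": "value"}'

print(internal_stage_resource_id(before_revoke))
print(internal_stage_resource_id(after_revoke) is None)
```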

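Troubleshooting

If an Azure private endpoint connection is established to a Snowflake stage as described in this topic, Azure applications that access the stage over the public network and use a private DNS service to resolve service hostnames cannot access the Snowflake stage.

If any application has configured a private DNS zone for the same domain, Microsoft Azure tries to resolve the storage account host by querying the private DNS service. If the entry for the storage account is not found in the private DNS service, a connection error occurs.

To address this issue, use one of the following two options:

  1. Remove the private DNS zone from the application, or dissociate it from the application.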
  2. Create a CNAME record for the storage account private hostname (that is, <storage_account_name>.privatelink.blob.core.windows.net) in the private DNS service and point it to the hostname specified by the output of this command:
    dig CNAME <storage_account_name>.privatelink.blob.core.windows.net