Azure private endpoints for internal stages¶
This topic provides concepts and detailed instructions for connecting to Snowflake internal stages through Microsoft Azure Private Endpoints.
Overview¶
Azure Private Endpoints (https://docs.microsoft.com/en-us/azure/private-link/private-endpoint-overview) and Azure Private Link (https://docs.microsoft.com/en-us/azure/private-link/private-link-overview) can be combined to provide secure connectivity to Snowflake internal stages. This setup ensures that data loading and data unloading operations to Snowflake internal stages use the Azure internal network and do not take place over the public Internet.
Prior to Microsoft supporting Private Endpoints for internal stage access, it was necessary to create a proxy farm within the Azure VNet to facilitate secure access to Snowflake internal stages. With the added support of Private Endpoints for Snowflake internal stages, users and client applications can now access Snowflake internal stages over the private Azure network. The following diagram summarizes this new support:
Note the following regarding the numbers in the BEFORE diagram:
Users have two options to connect to a Snowflake internal stage:
Option A allows an on-premises connection directly to the internal stage as shown by the number 1.
Option B allows a connection to the internal stage through a proxy farm as shown by the numbers 2 and 3.
If using the proxy farm, users can also connect to Snowflake directly as denoted by the number 4.
Note the following regarding the numbers in the AFTER diagram:
For clarity, the diagram shows a single Private Endpoint from one Azure VNet pointing to a single Snowflake internal stage (6 and 7).
Note that it is possible to configure multiple Private Endpoints, each within a different VNet, that point to the same Snowflake internal stage.
The updates in this feature remove the need to connect to Snowflake or a Snowflake internal stage through a proxy farm.
An on-premises user can connect to Snowflake directly as shown in number 5.
To connect to a Snowflake internal stage, an on-premises user connects to a Private Endpoint (number 6) and then uses Azure Private Link to connect to the Snowflake internal stage, as shown in number 7.
In Azure, each Snowflake account has a dedicated storage account to use as an internal stage. The storage account URI differs depending on whether the connection to the storage account uses private connectivity (i.e. Azure Private Link). The private connectivity URI includes a privatelink segment.
- Public storage account URI:
<storage_account_name>.blob.core.windows.net
- Private connectivity storage account URI:
<storage_account_name>.privatelink.blob.core.windows.net
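The relationship between the two URI forms can be sketched in a few lines of Python. This is an illustrative helper, not part of any Snowflake or Azure SDK; the function names and the example storage account name are assumptions made for the example.

```python
# Suffixes taken from the two URI forms listed above.
PUBLIC_SUFFIX = ".blob.core.windows.net"
PRIVATE_SUFFIX = ".privatelink.blob.core.windows.net"


def is_private_uri(uri: str) -> bool:
    """Return True if the host uses the privatelink form."""
    return uri.endswith(PRIVATE_SUFFIX)


def to_private_uri(uri: str) -> str:
    """Map a public storage account host to its private connectivity host."""
    # Check the private form first: it also ends with the public suffix.
    if is_private_uri(uri):
        return uri
    if not uri.endswith(PUBLIC_SUFFIX):
        raise ValueError(f"not a blob storage account host: {uri!r}")
    account = uri[: -len(PUBLIC_SUFFIX)]
    if not account:
        raise ValueError("missing storage account name")
    return account + PRIVATE_SUFFIX


# "examplestage" is a hypothetical storage account name.
print(to_private_uri("examplestage.blob.core.windows.net"))
```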
Benefits¶
Implementing Private Endpoints to access Snowflake internal stages provides the following advantages:
Internal stage data does not traverse the public Internet.
Client and SaaS applications, such as Microsoft Power BI, that run outside of the Azure VNet can connect to Snowflake securely.
Administrators are not required to modify firewall settings to access internal stage data.
Administrators can implement consistent security and monitoring regarding how users connect to storage accounts.
Limitations¶
Microsoft Azure defines how a Private Endpoint can interact with Snowflake:
A single Private Endpoint can communicate with a single Snowflake Service Endpoint. You can have multiple one-to-one configurations that connect to the same Snowflake internal stage.
The maximum number of private endpoints in your storage account that can connect to a Snowflake internal stage is fixed. For details, see Standard storage account limits (https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/azure-subscription-service-limits#standard-storage-account-limits).
Configuring private endpoints to access Snowflake internal stages¶
To configure Private Endpoints to access Snowflake internal stages, it is necessary to have support from the following three roles in your organization:
The Snowflake account administrator (i.e. a user with the Snowflake ACCOUNTADMIN system role).
The Microsoft Azure administrator.
The network administrator.
Depending on the organization, it may be necessary to coordinate the configuration efforts with more than one person or team to implement the following configuration steps.
Complete the following steps to configure and implement secure access to Snowflake internal stages through Azure Private Endpoints:
Verify that your Azure subscription is registered with the Azure Storage resource manager. This step allows you to connect to the internal stage from a private endpoint.
As a Snowflake account administrator, execute the following statements in your Snowflake account and record the ResourceID of the internal stage storage account defined by the privatelink_internal_stage key. For more information, see ENABLE_INTERNAL_STAGES_PRIVATELINK and SYSTEM$GET_PRIVATELINK_CONFIG.
use role accountadmin;
alter account set ENABLE_INTERNAL_STAGES_PRIVATELINK = true;
select key, value from table(flatten(input=>parse_json(system$get_privatelink_config())));
As the Azure administrator, create a Private Endpoint through the Azure portal.
View the Private Endpoint properties and record the resource ID value. This value will be the privateEndpointResourceID value in the next step.
Verify that the Target sub-resource value is set to blob.
For more information, see the Microsoft Azure Private Link documentation (https://docs.microsoft.com/en-us/azure/private-link/).
As the Snowflake administrator, call the SYSTEM$AUTHORIZE_STAGE_PRIVATELINK_ACCESS function using the privateEndpointResourceID value as the function argument. This step authorizes access to the Snowflake internal stage through the Private Endpoint.
use role accountadmin;
select system$authorize_stage_privatelink_access('<privateEndpointResourceID>');
If necessary, complete the steps in Revoking Private Endpoints to access Snowflake internal stages to revoke access to the internal stage.
As the network administrator, update the DNS settings to resolve the URLs as follows:
<storage_account_name>.blob.core.windows.net to <storage_account_name>.privatelink.blob.core.windows.net
When using a private DNS zone in an Azure VNet, create the alias record for <storage_account_name>.privatelink.blob.core.windows.net.
For more information, see Azure Private Endpoint DNS configuration (https://docs.microsoft.com/en-us/azure/private-link/private-endpoint-dns).
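The two Snowflake-side steps above (recording the internal stage value and authorizing the Private Endpoint) can be sketched in Python. This is a minimal illustration under stated assumptions: the raw JSON string returned by system$get_privatelink_config() has already been fetched by some client, the resource ID follows the standard Azure Resource Manager path layout for private endpoints, and all function names, subscription IDs, and resource names below are hypothetical.

```python
import json
import re

# Assumed Azure Resource Manager layout for a private endpoint resource ID.
# Quotes and whitespace are rejected so the value is safe inside a SQL literal.
_PRIVATE_ENDPOINT_ID = re.compile(
    r"^/subscriptions/[^/'\s]+/resourceGroups/[^/'\s]+"
    r"/providers/Microsoft\.Network/privateEndpoints/[^/'\s]+$",
    re.IGNORECASE,
)


def internal_stage_resource_id(config_json: str) -> str:
    """Return the value stored under the privatelink_internal_stage key."""
    config = json.loads(config_json)
    if "privatelink_internal_stage" not in config:
        raise KeyError("privatelink_internal_stage is not in the configuration")
    return config["privatelink_internal_stage"]


def authorize_statement(private_endpoint_resource_id: str) -> str:
    """Build the authorization statement for a validated resource ID."""
    if not _PRIVATE_ENDPOINT_ID.match(private_endpoint_resource_id):
        raise ValueError("unexpected private endpoint resource ID format")
    return (
        "select system$authorize_stage_privatelink_access("
        f"'{private_endpoint_resource_id}');"
    )


# Hypothetical values for illustration only.
sample_config = json.dumps({
    "privatelink_internal_stage":
        "/subscriptions/0000/resourceGroups/example-rg"
        "/providers/Microsoft.Storage/storageAccounts/examplestage"
})
print(internal_stage_resource_id(sample_config))
print(authorize_statement(
    "/subscriptions/0000/resourceGroups/example-rg"
    "/providers/Microsoft.Network/privateEndpoints/example-pe"
))
```

Validating the resource ID before building the statement catches a common copy-and-paste mistake: supplying the storage account resource ID where the Private Endpoint resource ID is expected.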
Tip
Use a separate Snowflake account for testing, and configure a private DNS zone in a test VNet to test the feature so that the testing is isolated and does not impact your other workloads.
If using a separate Snowflake account is not possible, use a test user to access Snowflake from a test VNet where the DNS changes are made.
To test from on-premises applications, use DNS forwarding to forward requests to the Azure private DNS in the VNet where the DNS settings are made. Execute the following command from the client machine to verify that the IP address returned is the private IP address for the storage account:
dig <storage_account_name>.blob.core.windows.net
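The same check can be scripted. The sketch below uses only the Python standard library; the function names are assumptions made for the example. It treats an address as private whenever Python's ipaddress module classifies it as private, which is broader than the RFC 1918 ranges (it also covers loopback and link-local addresses), so use the result as a quick sanity check rather than proof that traffic flows through the Private Endpoint.

```python
import ipaddress
import socket


def is_private_address(ip: str) -> bool:
    """Return True if the address is in a private (non-public) range."""
    return ipaddress.ip_address(ip).is_private


def resolves_privately(hostname: str) -> bool:
    """Resolve hostname over IPv4 and check that every address is private."""
    infos = socket.getaddrinfo(
        hostname, 443, family=socket.AF_INET, proto=socket.IPPROTO_TCP
    )
    addresses = {info[4][0] for info in infos}
    return bool(addresses) and all(is_private_address(a) for a in addresses)


# Address classification works offline; resolves_privately() needs DNS access
# from the client machine being tested.
print(is_private_address("10.1.2.3"))
```

Run resolves_privately() with the public form of the host name (<storage_account_name>.blob.core.windows.net) from the client machine, mirroring the dig command above.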
Blocking public access — Recommended¶
Once you have configured Private Endpoints to access the internal stage via Azure Private Link, you can optionally block requests from public IP addresses to the internal stage. Once public access is blocked, all traffic must be through the Private Endpoint.
Controlling public access to an Azure internal stage is different from controlling public access to the Snowflake service. You use the SYSTEM$BLOCK_INTERNAL_STAGES_PUBLIC_ACCESS function, not a network policy, to block requests to the internal stage. Unlike network policies, this function cannot block some public IP addresses while allowing others. When you call SYSTEM$BLOCK_INTERNAL_STAGES_PUBLIC_ACCESS, all public IP addresses are blocked.
Important
Confirm that traffic via private connectivity is successfully reaching the internal stage before blocking public access. Blocking public access without configuring private connectivity can cause unintended disruptions, including interference with managed services like Azure Data Factory.
The SYSTEM$BLOCK_INTERNAL_STAGES_PUBLIC_ACCESS function enforces its restrictions by altering the Networking settings of the Azure storage account where the internal stage is located. These Azure settings are commonly referred to as the “storage account firewall settings”. Executing the Snowflake function does the following in Azure:
Sets the Public network access field to Enabled from selected virtual networks and IP addresses.
Adds Snowflake VNet subnet IDs to the Virtual Networks section.
Clears all IP addresses from the Firewall section.
To block all traffic from public IP addresses to the internal stage, execute:
SELECT SYSTEM$BLOCK_INTERNAL_STAGES_PUBLIC_ACCESS();
The function can take a few minutes to finish executing.
Ensuring public access is blocked¶
You can determine whether public IP addresses are able to access an internal stage by executing the SYSTEM$INTERNAL_STAGES_PUBLIC_ACCESS_STATUS function.
If the Azure settings are currently blocking all public traffic, the function returns Public Access to internal stages is blocked.
This verifies that the settings have not been changed since the SYSTEM$BLOCK_INTERNAL_STAGES_PUBLIC_ACCESS function was executed.
If at least some public IP addresses can access the internal stage, the function returns Public Access to internal stages is unblocked.
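A monitoring script could interpret the returned text as follows. This is a small sketch that assumes the function returns exactly one of the two strings quoted above; the helper name is hypothetical, and any other text is treated as an error rather than guessed at.

```python
# Status strings as quoted in this topic.
BLOCKED = "Public Access to internal stages is blocked"
UNBLOCKED = "Public Access to internal stages is unblocked"


def public_access_blocked(status: str) -> bool:
    """Translate the status text into a boolean; fail on unknown text."""
    text = status.strip()
    if text == BLOCKED:
        return True
    if text == UNBLOCKED:
        return False
    raise ValueError(f"unrecognized status: {status!r}")


print(public_access_blocked(BLOCKED))
```

Comparing whole strings instead of searching for the word "blocked" matters here, because "unblocked" contains "blocked" as a substring.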
Unblocking public access¶
You can execute the SYSTEM$UNBLOCK_INTERNAL_STAGES_PUBLIC_ACCESS function to allow public access to an internal stage that was previously blocked.
Executing the function alters the Networking settings of the Azure storage account where the internal stage is located. It sets the Azure Public network access field to Enabled from all networks.
Revoking Private Endpoints to access Snowflake internal stages¶
Complete the following steps to revoke access to Snowflake internal stages through Microsoft Azure Private Endpoints:
As a Snowflake administrator, set the ENABLE_INTERNAL_STAGES_PRIVATELINK parameter to FALSE and call the SYSTEM$REVOKE_STAGE_PRIVATELINK_ACCESS function to revoke access to the Private Endpoint, using the same privateEndpointResourceID value that was used to originally authorize access to the Private Endpoint.
use role accountadmin;
alter account set enable_internal_stages_privatelink = false;
select system$revoke_stage_privatelink_access('<privateEndpointResourceID>');
As an Azure administrator, delete the Private Endpoint through the Azure portal.
As a network administrator, remove the DNS and alias records that were used to resolve the storage account URLs.
At this point, access through the Private Endpoint is revoked, and the query result from calling the SYSTEM$GET_PRIVATELINK_CONFIG function should not return the privatelink_internal_stage key and its value.
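The final verification can be expressed as a short check. This sketch assumes the raw JSON string returned by SYSTEM$GET_PRIVATELINK_CONFIG has already been retrieved by a client; the function name and the sample keys other than privatelink_internal_stage are hypothetical.

```python
import json


def stage_access_revoked(config_json: str) -> bool:
    """Return True when the privatelink_internal_stage key is absent."""
    config = json.loads(config_json)
    return "privatelink_internal_stage" not in config


# Hypothetical configuration outputs after and before revocation.
print(stage_access_revoked('{"some_other_key": "value"}'))
print(stage_access_revoked('{"privatelink_internal_stage": "/subscriptions/0000"}'))
```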
Troubleshooting¶
Azure applications that access Snowflake stages over the public Internet and also use a private DNS service to resolve service hostnames cannot access Snowflake stages if a private endpoint connection is established to the stage as described in this topic.
Once a private endpoint connection is created, Microsoft Azure automatically creates a CNAME record in the public DNS service that points the storage account host to its Azure Private Link counterpart (i.e. <storage_account_name>.privatelink.blob.core.windows.net). If any application has configured a private DNS zone for the same domain, then Microsoft Azure tries to resolve the storage account host by querying the private DNS service. If the entry for the storage account is not found in the private DNS service, a connection error occurs.
There are two options to address this issue:
Remove or dissociate the private DNS zone from the application.
Create a CNAME record for the storage account private hostname (i.e. <storage_account_name>.privatelink.blob.core.windows.net) in the private DNS service and point it to the hostname specified by the output of this command:
dig CNAME <storage_account_name>.privatelink.blob.core.windows.net