跨区域和云平台安全共享数据

This topic provides instructions on using replication to allow data providers to securely share data with data consumers across different regions and cloud platforms.

Note

If you use listings to share data with specific consumer accounts, or you use the Snowflake Marketplace, you can use Cross-Cloud Auto-fulfillment to automatically fulfill your data product to other regions.

以下任意云平台上托管的 Snowflake 账户均支持跨区域数据共享:

  • Amazon Web Services (AWS)
  • Google Cloud Platform (GCP)
  • Microsoft Azure (Azure)

Important

如果将主数据库复制到不同于源 Snowflake 账户的地理区域或国家/地区,则应确认您的组织在传输或托管数据的位置方面不受任何法律或法规限制。

数据共享注意事项

Diagram of data replication and sharing between regions and clouds

与不同区域和云平台的数据使用者共享数据

Snowflake 数据提供商只需执行几个简单的步骤,即可与不同区域的数据使用者共享数据。

第 1 步:设置数据复制

Note

Before configuring data replication, you must create an account in a region where you wish to share data and link it to your local account. For more information, see Working with organizations and accounts.

数据复制的设置涉及以下任务:

  1. 为账户启用复制功能。

    An organization administrator must enable replication for the source account that contains the data to share and the target accounts in regions where you want to share data with consumers. For instructions on enabling replication, see Prerequisite: Enable replication for accounts in the organization.

  2. 创建复制组并添加数据库和共享。

  3. 将具有数据库和共享的组复制到要与使用者共享数据的区域。

第 2 步:与数据使用者共享数据

要与同一区域中的数据使用者共享数据,需要将一个或多个使用者账户添加到从源账户复制的辅助共享。

For detailed instructions, see Getting Started with Secure Data Sharing.

示例 1:共享数据

数据提供商 Acme 想与不同区域的数据使用者共享数据。

Diagram of a basic example on how to share data between regions

在源账户中执行

要创建复制组,在其中包含要复制到另一个区域的数据库和共享,请执行以下 SQL 语句。

Note

If you have previously enabled replication for an individual database, you must disable database replication for the database before you add it to a replication group. For details, see Transitioning from database replication to group-based replication.

Create a replication group my_rg that includes database db1 and share share1 to replicate to the account account_2 in the acme org.

USE ROLE ACCOUNTADMIN;

CREATE REPLICATION GROUP my_rg
  OBJECT_TYPES = databases, shares
  ALLOWED_DATABASES = db1
  ALLOWED_SHARES = share1
  ALLOWED_ACCOUNTS = acme.account_2;

在目标账户上执行

在另一个区域中的目标账户内执行以下 SQL 语句。您添加到共享的任何账户都应为目标账户所在区域的本地账户。更改共享以设置账户(目标)列表后,您添加的账户不会在下次刷新时被覆盖。

  1. Create a secondary replication group in account_2:

    USE ROLE ACCOUNTADMIN;
    
    CREATE REPLICATION GROUP my_rg
      AS REPLICA OF acme.account1.my_rg;
  2. Manually refresh the replication group to replicate the databases and shares to account_2:

    ALTER REPLICATION GROUP my_rg REFRESH;
  3. Add one or more consumer accounts to share1:

    ALTER SHARE share1 ADD ACCOUNTS = consumer_org.consumer_account_name;

You can automate refresh operations by setting the REPLICATION_SCHEDULE parameter for the primary replication group using the ALTER REPLICATION GROUP command in the source account. For more information,see Replication schedule.

示例 2:共享数据库内的一个数据子集

数据提供商 Acme 想与不同区域的数据使用者共享数据子集。为降低复制成本,他们希望仅复制主表中的相关行。由于复制在数据库级别完成,此示例描述了 Acme 如何使用流和任务,将所需行从主数据库复制到新数据库,创建共享并授予对视图的权限,以及将复制组中的两者复制到不同区域中的账户,以供使用者访问。在此情景中,新数据库和共享被指定为数据复制的主要对象。

Diagram of an advanced example on how to share data between regions

在源账户中执行

使用以下 SQL 命令,在源账户中新建数据库并启用复制。

Note

If you have previously enabled replication for an individual database, you must disable database replication for the database before you add it to a replication group. For details, see Transitioning from database replication to group-based replication.

  1. In your local account, create a database db1 with a subset of data from the database with the source data:

    USE ROLE ACCOUNTADMIN;
    
    CREATE DATABASE db1;
    
    CREATE SCHEMA db1.sch;
    
    CREATE TABLE db1.sch.table_b AS
      SELECT customerid, user_order_count, total_spent
      FROM source_db.sch.table_a
      WHERE REGION='azure_eastus2';
  2. 使用要共享的数据创建安全视图:

    CREATE SECURE VIEW db1.sch.view1 AS
      SELECT customerid, user_order_count, total_spent
      FROM db1.sch.table_b;
  3. 创建流,以记录对源表所做的更改:

    CREATE STREAM mystream ON TABLE source_db.sch.table_a APPEND_ONLY = TRUE;
  4. Create a task to insert data into the table in db1 with changes from the source data:

    CREATE TASK mytask1
      WAREHOUSE = mywh
      SCHEDULE = '5 minute'
    WHEN
      SYSTEM$STREAM_HAS_DATA('mystream')
    AS
      INSERT INTO table_b(CUSTOMERID, USER_ORDER_COUNT, TOTAL_SPENT)
     SELECT customerid, user_order_count, total_spent
     FROM mystream
     WHERE region='azure_eastus2'
     AND METADATA$ACTION = 'INSERT';
  5. 启动数据更新任务:

    ALTER TASK mytask1 RESUME;
  6. 创建共享并向此共享授予权限:

    CREATE SHARE share1;
    
    GRANT USAGE ON DATABASE db1 TO SHARE share1;
    GRANT USAGE ON SCHEMA db1.sch TO SHARE share1;
    GRANT SELECT ON VIEW db1.sch.view1 TO SHARE share1;
  7. 创建包含数据库和共享的主复制组:

    CREATE REPLICATION GROUP my_rg
      OBJECT_TYPES = DATABASES, SHARES
      ALLOWED_DATABASES = db1
      ALLOWED_SHARES = share1
      ALLOWED_ACCOUNTS = acme_org.account_2;

在目标账户上执行

从另一个区域中的目标账户执行以下 SQL 命令。

  1. 创建辅助复制组,从源账户复制数据库和共享:

    USE ROLE ACCOUNTADMIN;
    
    CREATE REPLICATION GROUP my_rg
      AS REPLICA OF acme_org.account_1.my_rg;
  2. 手动刷新组,将对象复制到当前账户:

    ALTER REPLICATION GROUP my_rg REFRESH;
  3. 将一个或多个使用者账户添加到共享:

    ALTER SHARE share1 ADD ACCOUNTS = consumer_org.consumer_account_name;

You can automate refresh operations by setting the REPLICATION_SCHEDULE parameter for the primary replication group using the ALTER REPLICATION GROUP command in the source account. For more information,see Replication schedule.

示例 3:共享多个数据库中的数据

A data provider, Acme, wants to share data from multiple databases with data consumers in a different region. They create a secure view and share (for instructions, see Share data from multiple databases), then replicate all the databases and share in a replication group to replicate data to accounts in other regions.

在源账户中执行

Create a replication group my_rg that includes the databases and share from Example 1: Create and share a secure view in an existing database to replicate to account_2 in the acme org:

CREATE REPLICATION GROUP my_rg
  OBJECT_TYPES = databases, shares
  ALLOWED_DATABASES = database1, database2, database3
  ALLOWED_SHARES = share1
  ALLOWED_ACCOUNTS = acme.account_2;

在目标账户上执行

从另一个区域中的目标账户执行以下 SQL 命令。

  1. 创建辅助复制组,从源账户复制数据库和共享:

    USE ROLE ACCOUNTADMIN;
    
    CREATE REPLICATION GROUP my_rg
      AS REPLICA OF acme_org.account_1.my_rg;
  2. 手动刷新组,将对象复制到当前账户:

    ALTER REPLICATION GROUP my_rg REFRESH;
  3. 将一个或多个使用者账户添加到共享:

    ALTER SHARE share1 ADD ACCOUNTS = consumer_org.consumer_account_name;

You can automate refresh operations by setting the REPLICATION_SCHEDULE parameter for the primary replication group using the ALTER REPLICATION GROUP command in the source account. For more information,see Replication schedule.