对象依赖关系

本主题提供有关对象依赖关系的概念以及与 Account Usage 视图 OBJECT_DEPENDENCIES 相关的信息。

什么是对象依赖关系?

An object dependency means that in order to operate on an object, the object that is being operated on must reference metadata for itself or reference metadata for at least one other object. Snowflake tracks object dependencies in the Account Usage view OBJECT_DEPENDENCIES.

Snowflake 支持本地 Snowflak 账户中的对象依赖关系以及与数据共享相关的某些依赖关系,例如,通过提供商共享获取的表中,在使用者账户中创建视图。通过共享对象的依赖关系,数据主管能够确保更高的数据完整性,更全面地遵循每个监管标准,并生成更详细的影响分析。

Snowflake supports the following dependency types that can trigger a dependency: the object name value, the object ID value, and the combination of the object name value with the object ID value.

BY_NAME:

A BY_NAME dependency occurs when the SQL statement specifies the name value of the object itself (e.g. a CREATE or ALTER command), or when an object calls the name value of another object (e.g. using a FROM clause) to complete a SQL operation.

例如,请注意以下语句:

create view myview as select * from mytable;

The table name value mytable is metadata for the table. The view named myview is dependent on the table named mytable; the table must exist to create the view.

Snowflake refers to the view named myview as the referencing object and the table mytable as the referenced object.

BY_ID:

A BY_ID dependency occurs when an object stores the object ID value of another object. One example of an ID dependency is an external stage storing the OBJECT_ID value of a storage integration. Currently, the storage integration object ID value is only accessible to Snowflake and is not made visible through any customer-facing SQL operation.

create stage my_ext_stage
  url='s3china://load/files/'
  storage_integration = myint;

Snowflake refers to the external stage named my_ext_stage as the referencing object and the storage integration named myint as the referenced object.

BY_NAME_AND_ID:

Some Snowflake objects (e.g. materialized views) are dependent on both the object name value and the object ID value. These objects are often the result of a CREATE OR REPLACE statement to replace an existing object or an ALTER statement to rename an object.

For more information, see the Usage notes section of the Account Usage OBJECT_DEPENDENCIES view.

支持的对象依赖关系

Snowflake 支持引用的对象和受引用的对象,如下所示:

Referencing ObjectReferenced ObjectDependency Type
View, Secure View, dynamic table, SQL UDF, SQL UDTF, and other objects referenced by name

视图

安全视图

物化视图

动态表

UDF(各种)

UDTF

以及按名称引用的其他对象

BY_NAME

外部暂存区

存储集成

表、视图、安全视图

BY_ID
External tableStageBY_ID
Materialized ViewTable, External TableBY_NAME_AND_ID

请注意,Snowflake 在数据共享上下文中仅支持以下对象:

Referencing objectReferenced objectDependency type
View, dynamic table, SQL UDF, SQL UDTF

安全视图

安全物化视图

动态表

安全 UDF 和安全 UDTF

BY_NAME
Materialized viewTableBY_NAME_AND_ID

For more information, see the Usage Notes section of the OBJECT_DEPENDENCIES view.

优势

识别对象依赖关系,可以深入了解数据跟踪用例,如下所示:

Impact analysis:

通过了解对象依赖关系,数据管理员可以识别引用的对象和受引用的对象之间的关系,以确保受引用的对象的更新,不会对引用的对象的用户产生不利影响。

例如,表所有者计划向表中添加列。根据表名查询 OBJECT_DEPENDENCIES 视图,可返回所有会受影响的对象(例如视图)。

然后,数据管理员可以协调行动计划,以确保表和视图更新的时间不会导致任何中断的查询,这将对用户查询从表创建的视图产生不利影响。

Compliance:

对象依赖关系可帮助合规官识别敏感数据源(即受引用的对象)和数据目标(即引用的对象)之间的关系。然后,合规官可以根据合规性要求(例如 GDPR),决定如何更好地更新受引用的对象和引用的对象。

Data integrity:

对象依赖关系可帮助主要数据专业人员(如分析师、科学家、合规官和其他业务用户)确信数据来自可信赖的源。

限制

In addition to the view usage notes, note the following limitations when querying the OBJECT_DEPENDENCIES view:

Session parameters:

Snowflake cannot accurately compute the dependencies of objects that include session parameters in their definitions because session parameters can take on different values depending on the context.

Snowflake 建议不要在视图和函数定义中使用会话变量。

Snowflake implementations:

此视图不会获取 Snowflake 实现所需的依赖关系。例如,该视图不会记录通过另一个表的克隆创建新表所需的依赖关系。

Object resolution:

如果视图定义使用函数调用对象来创建视图,或者如果在其他函数或视图内调用对象,则 Snowflake 不会记录对象依赖关系。例如:

create or replace view v_on_stage_function
as
select *
from T1
where get_presigned_url(@stage1, 'data_0.csv.gz')
is not null;

In this example, the function get_presigned_url calls the stage stage1. Snowflake does not record that the view named v_on_stage_function depends on the stage named stage1.

Broken dependencies:

If the dependency type value is BY_NAME_AND_ID and an object dependency changes due to a CREATE OR REPLACE or ALTER operation on an object, Snowflake only records the object dependency prior to these operations.

Snowflake 不会在执行这些操作后在视图查询结果中记录对象依赖关系,因为结果是损坏的引用。

具有 Snowflake 功能和服务的对象依赖关系

External objects:

Snowflake 仅跟踪 Snowflake 对象的对象依赖关系。例如,如果 Snowflake 对象依赖于 Amazon S3 桶,则此视图不会记录对桶的依赖关系,因为该桶是 Amazon 对象,而不是 Snowflake 对象。

Replication:

虽然辅助对象依赖于主对象,但此视图不会记录由于复制操作而导致的依赖关系。

Data sharing:

对于提供商账户,此视图不允许数据共享提供商账户确定数据共享使用者账户中的依赖对象。例如,数据共享提供商创建视图并共享该视图。数据共享提供商无法使用此视图来确定使用者账户中通过共享视图创建的任何对象(例如新表或视图)。

对于使用者账户,此视图不允许数据共享使用者账户确定数据共享提供商账户中的依赖对象。例如,如果数据共享使用者账户使用由数据共享提供商账户提供的 UDF,则数据共享使用者无法使用此视图来标识共享的 UDF 所依赖的对象。

For more information, refer to the Usage notes.

查询 OBJECT_DEPENDENCIES 视图

以下示例涵盖以下用例:

  1. 显示依赖于外部表的对象。
  2. 影响分析:查找表所引用的对象。
  3. GDPR:查找给定视图的数据源。
  4. 数据共享。

显示依赖于外部表的对象

Create a materialized view named sales_view from the external table named sales_staging_table:

CREATE OR REPLACE MATERIALIZED VIEW sales_view AS SELECT * FROM sales_staging_table;

Query the OBJECT_DEPENDENCIES view in the Account Usage schema of the shared SNOWFLAKE database. Note that the materialized view is the referencing_object_name and the external table is the referenced_object_domain:

SELECT referencing_object_name, referencing_object_domain, referenced_object_name, referenced_object_domain
FROM snowflake.account_usage.object_dependencies
WHERE referenced_object_name = 'SALES_STAGING_TABLE' and referenced_object_domain = 'EXTERNAL TABLE';
+-------------------------+---------------------------+------------------------+--------------------------+
| REFERENCING_OBJECT_NAME | REFERENCING_OBJECT_DOMAIN | REFERENCED_OBJECT_NAME | REFERENCED_OBJECT_DOMAIN |
+-------------------------+---------------------------+------------------------+--------------------------+
| SALES_VIEW              | MATERIALIZED VIEW         | SALES_STAGING_TABLE    | EXTERNAL TABLE           |
+-------------------------+---------------------------+------------------------+--------------------------+

影响分析:查找表所引用的对象

Consider a base table named SALES_NA, where NA indicates North America, US indicates United States, and CAL indicates California, with a series of nested views:

  • (table) SALES_NA » (view) NORTH_AMERICA_SALES » (view) US_SALES
  • (table) SALES_NA » (view) NORTH_AMERICA_SALES » (view) CAL_SALES

要创建表和嵌套视图,请执行以下命令:

CREATE TABLE sales_na(product string);
CREATE OR REPLACE VIEW north_america_sales AS SELECT * FROM sales_na;
CREATE VIEW us_sales AS SELECT * FROM north_america_sales;
CREATE VIEW cal_sales AS SELECT * FROM north_america_sales;

Similarly, consider the relationship of the base table SALES_NA to its nested views, and consider the base table SALES_UK, where UK indicates the United Kingdom, to its nested view.

Note that two different views serve as source objects to derive the view named GLOBAL_SALES:

  • (table) SALES_NA » (view) NORTH_AMERICA_SALES » (view) GLOBAL_SALES
  • (table) SALES_UK » (view) GLOBAL_SALES

要创建这些嵌套视图,请执行以下命令:

CREATE TABLE sales_uk (product string);
CREATE VIEW global_sales AS SELECT * FROM sales_uk UNION ALL SELECT * FROM north_america_sales;

Query the OBJECT_DEPENDENCIES view in the Account Usage schema of the shared SNOWFLAKE database to determine the object references for the table SALES_NA. Note the fourth row in the query result, which specifies the table SALES_NA but does not reference the table SALES_UK:

WITH RECURSIVE referenced_cte
(object_name_path, referenced_object_name, referenced_object_domain, referencing_object_domain, referencing_object_name, referenced_object_id, referencing_object_id)
    AS
      (
        SELECT referenced_object_name || '-->' || referencing_object_name as object_name_path,
               referenced_object_name, referenced_object_domain, referencing_object_domain, referencing_object_name, referenced_object_id, referencing_object_id
          FROM snowflake.account_usage.object_dependencies referencing
          WHERE true
            AND referenced_object_name = 'SALES_NA' AND referenced_object_domain='TABLE'

        UNION ALL

        SELECT object_name_path || '-->' || referencing.referencing_object_name,
              referencing.referenced_object_name, referencing.referenced_object_domain, referencing.referencing_object_domain, referencing.referencing_object_name,
              referencing.referenced_object_id, referencing.referencing_object_id
          FROM snowflake.account_usage.object_dependencies referencing JOIN referenced_cte
            ON referencing.referenced_object_id = referenced_cte.referencing_object_id
            AND referencing.referenced_object_domain = referenced_cte.referencing_object_domain
      )

  SELECT object_name_path, referenced_object_name, referenced_object_domain, referencing_object_name, referencing_object_domain
    FROM referenced_cte
;
+-----------------------------------------------+------------------------+--------------------------+-------------------------+---------------------------+
| OBJECT_NAME_PATH                              | REFERENCED_OBJECT_NAME | REFERENCED_OBJECT_DOMAIN | REFERENCING_OBJECT_NAME | REFERENCING_OBJECT_DOMAIN |
+-----------------------------------------------+------------------------+--------------------------+-------------------------+---------------------------+
| SALES_NA-->NORTH_AMERICA_SALES                | SALES_NA               | TABLE                    | NORTH_AMERICA_SALES     | VIEW                      |
| SALES_NA-->NORTH_AMERICA_SALES-->CAL_SALES    | NORTH_AMERICA_SALES    | VIEW                     | CAL_SALES               | VIEW                      |
| SALES_NA-->NORTH_AMERICA_SALES-->US_SALES     | NORTH_AMERICA_SALES    | VIEW                     | US_SALES                | VIEW                      |
| SALES_NA-->NORTH_AMERICA_SALES-->GLOBAL_SALES | NORTH_AMERICA_SALES    | VIEW                     | GLOBAL_SALES            | VIEW                      |
+-----------------------------------------------+------------------------+--------------------------+-------------------------+---------------------------+

GDPR:查找给定视图的数据源

可以从许多不同的源对象创建派生对象(例如视图、CTAS),以提供自定义视图或仪表板。为满足 GDPR 等监管要求,合规官和审计员需要能够将数据从给定对象跟踪到其原始数据源。

For example, the view GLOBAL_SALES is derived from two different dependency paths that point to two different base tables:

  • (table) SALES_NA » (view) NORTH_AMERICA_SALES » (view) GLOBAL_SALES
  • (table) SALES_UK » (view) GLOBAL_SALES

要创建这些嵌套视图,请执行以下命令:

CREATE TABLE sales_na (product string);
CREATE OR REPLACE VIEW north_america_sales AS SELECT * FROM sales_na;
CREATE TABLE sales_uk (product string);
CREATE VIEW global_sales AS SELECT * FROM sales_uk UNION ALL SELECT * FROM north_america_sales;

Query the OBJECT_DEPENDENCIES view in the Account Usage schema of the shared SNOWFLAKE database to find the data source(s) of the view GLOBAL_SALES. Each row in the query result specifies a dependency path to a unique object.

WITH RECURSIVE referenced_cte
(object_name_path, referenced_object_name, referenced_object_domain, referencing_object_domain, referencing_object_name, referenced_object_id, referencing_object_id)
    AS
      (
        SELECT referenced_object_name || '<--' || referencing_object_name AS object_name_path,
               referenced_object_name, referenced_object_domain, referencing_object_domain, referencing_object_name, referenced_object_id, referencing_object_id
          from snowflake.account_usage.object_dependencies referencing
          WHERE true
            AND referencing_object_name = 'GLOBAL_SALES' and referencing_object_domain='VIEW'

        UNION ALL

        SELECT referencing.referenced_object_name || '<--' || object_name_path,
              referencing.referenced_object_name, referencing.referenced_object_domain, referencing.referencing_object_domain, referencing.referencing_object_name,
              referencing.referenced_object_id, referencing.referencing_object_id
          FROM snowflake.account_usage.object_dependencies referencing JOIN referenced_cte
            ON referencing.referencing_object_id = referenced_cte.referenced_object_id
            AND referencing.referencing_object_domain = referenced_cte.referenced_object_domain
      )

  SELECT object_name_path, referencing_object_name, referencing_object_domain, referenced_object_name, referenced_object_domain
    FROM referenced_cte
;
+-----------------------------------------------+-------------------------+---------------------------+------------------------+--------------------------+
| OBJECT_NAME_PATH                              | REFERENCING_OBJECT_NAME | REFERENCING_OBJECT_DOMAIN | REFERENCED_OBJECT_NAME | REFERENCED_OBJECT_DOMAIN |
+-----------------------------------------------+-------------------------+---------------------------+------------------------+--------------------------+
| SALES_UK<--GLOBAL_SALES                       | GLOBAL_SALES            | VIEW                      | SALES_UK               | TABLE                    |
| NORTH_AMERICA_SALES<--GLOBAL_SALES            | GLOBAL_SALES            | VIEW                      | NORTH_AMERICA_SALES    | VIEW                     |
| SALES_NA<--NORTH_AMERICA_SALES<--GLOBAL_SALES | NORTH_AMERICA_SALES     | VIEW                      | SALES_NA               | TABLE                    |
+-----------------------------------------------+-------------------------+---------------------------+------------------------+--------------------------+

数据共享

请注意下表,该表摘自使用者账户中的 OBJECT_DEPENDENCIES 视图,其中:

  • V1 specifies a view that the consumer creates from a shared object.
  • S_V1 specifies a view that the provider shares.
  • S_T1 specifies a table that the provider shares.
REFERENCING_OBJECT_NAMEREFERENCED_OBJECT_NAMEREFERENCED_OBJECT_DOMAINREFERENCED_OBJECT_ID
1V1S_V1TABLENULL
2V1S_T1TABLENULL

根据此表,请注意以下几点:

  • If the provider revokes S_T1 from the share, the consumer continues to see rows that specify S_T1 (row 2) in their local view as long as S_T1 was not renamed prior to the revocation.
  • If the provider drops a table or view in their account, the table or view is no longer included in the share. The local consumer view preserves existing records for the dropped table or view because the table or view was shared prior to the drop operation in the provider account.

使用者无法观察提供商账户中的视图更改。