SYSTEM$CLASSIFY_SCHEMA

Schedules the tables in the specified schema to be classified with the option to specify the number of rows to sample in each table and assign the recommended sensitive data classification system tag to each column in the tables stored in the specified schema.

语法

SYSTEM$CLASSIFY_SCHEMA( '<schema_name>' , <object> )

实参

schema_name

包含待分类表的架构的名称。如果当前会话中未使用数据库和架构,则名称必须完全限定。

名称必须与数据库中存储的名称完全相同。如果名称包含特殊字符、大写字母或空格,则名称必须先用双引号括起来,然后再放在单引号内。

object

Specifies a JSON OBJECT that determines how the classification process works. One of the following:

NULL

Snowflake uses its default configuration based on the number of rows in the specified object. System tags are not set on any columns in the specified object.

{}

An empty object, which is functionally equivalent to specifying NULL.

{'sample_count': integer}

Specifies the number of rows to sample in the specified object. Any number from 1 to 10000, inclusive.

{'auto_tag': true}

Sets the recommended classification system tags on the columns in the specified object when the classification process is complete.

When you use this argument, call the stored procedure with the role that has the OWNERSHIP privilege on the schema.

{'sample_count': integer, 'auto_tag': true}

Classify the specified object while specifying the number of rows to sample and set the recommended system tag on each column in the specified object when the classification process is complete.

When you use this argument, call the stored procedure with the role that has the OWNERSHIP privilege on the schema.

{'use_all_custom_classifiers': true}

Snowflake evaluates all custom classification instances and recommends the tag associated with a custom classification instance based on the classification result.

This option uses the custom classifiers that are accessible to the role in use that calls the stored procedure (current role, caller’s rights). For information, see Understanding caller’s rights and owner’s rights stored procedures.

{'custom_classifiers': ['instance_name1' [ , 'instance_name2' ... ] ]}

Specifies the custom classification instance to evaluate as a source for the recommended tag to be set on the column.

You can specify multiple instances in the list and separate each instance with a comma.

返回

存储过程会使用以下格式返回 JSON 对象。例如:

{
  "failed": [
    {
      "message": "Insufficient privileges.",
      "table_name": "t4"
    }
  ],
  "succeeded": [
    {
      "table_name": "t1"
    },
    {
      "table_name": "t2"
    },
    {
      "table_name": "t3"
    }
  ]
}

其中:

failed

指定一条消息,说明未计划对该表进行分类的原因。

succeeded

指定为 Data Classification 而计划的每个表。

使用说明

Caution

调用此存储过程时,架构中每个表的分类过程并行运行并消耗仓库资源。如果在短时间内多次调用此存储过程以同时对架构中的表进行分类,则这些过程也会并行运行。许多并行分类过程可能超出仓库能力,这将导致一些表的分类过程失败。因此,一个架构可能会对一些表进行分类,而对其他表不进行分类。

Prior to calling SYSTEM$CLASSIFY_SCHEMA, evaluate the number of columns in each table, the number of tables in a schema, the number of schemas that you want to classify, and the warehouse size that is in use for the session.

示例

对架构表的分类进行暂存:

CALL SYSTEM$CLASSIFY_SCHEMA('hr.tables', null);

对架构表的分类进行暂存,并指定要抽样的行数:

CALL SYSTEM$CLASSIFY_SCHEMA('hr.tables', {'sample_count': 1000});

对架构表的分类进行暂存,并为列设置系统标签:

CALL SYSTEM$CLASSIFY_SCHEMA('hr.tables', {'auto_tag': true});

对架构表的分类进行暂存,指定要抽样的行数,并为表中的每列设置推荐的系统标签:

CALL SYSTEM$CLASSIFY_SCHEMA('hr.tables', {'sample_count': 1000, 'auto_tag': true});