Excluding data from automatic sensitive data classification
With automatic sensitive data classification, Snowflake classifies whether data is sensitive at regular intervals without user intervention. You enable the feature by defining a classification profile, then setting it on the database that contains the data you want classified.
You can use classification profile settings and system tags to exclude certain data from automatic classification.
For example, suppose a database my_db has three tables, t1, t2, and t3. By default, when you set a classification profile on my_db, all three tables are automatically classified. You can configure Snowflake to skip t2 during automatic classification so that only tables t1 and t3 are classified.
Workflow
Excluding data from automatic sensitive data classification is a two-step process:
1. Apply the SNOWFLAKE.CORE.SKIP_SENSITIVE_DATA_CLASSIFICATION tag to every object that you want excluded from automatic sensitive data classification. For details, see Set tag on data objects.
2. Set the enable_tag_based_sensitive_data_exclusion key of the classification profile to true. For details, see Define classification profile.
This process is known as tag-based sensitive data exclusion.
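The two steps above can be sketched for the earlier my_db example as follows. This is a minimal sketch, not a definitive procedure: the schema name my_schema and the location of the profile are assumptions for illustration, and it presumes that a profile named my_classification_profile already exists.

```sql
-- Step 1: tag the table that should be skipped.
-- my_schema is a hypothetical schema name used only for illustration.
ALTER TABLE my_db.my_schema.t2
  SET TAG SNOWFLAKE.CORE.SKIP_SENSITIVE_DATA_CLASSIFICATION = 'TRUE';

-- Step 2: turn on tag-based exclusion for an existing classification profile.
-- The profile is assumed to live in my_db.my_schema.
CALL my_db.my_schema.my_classification_profile!SET_ENABLE_TAG_BASED_SENSITIVE_DATA_EXCLUSION(true);
```

The order of the two steps should not matter, because the tag only takes effect once the profile key is true.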
Note
After you apply the system tag and configure the classification profile, if you call the SYSTEM$CLASSIFY stored procedure and specify the classification profile, Snowflake excludes the tagged objects from classification.
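As a hedged illustration of the note above, the following call passes a classification profile to SYSTEM$CLASSIFY. The fully qualified object and profile names are hypothetical, and the exact argument forms should be checked against the SYSTEM$CLASSIFY reference.

```sql
-- Manually classify a table using the settings of a classification profile.
-- If my_table (or one of its columns) carries the skip tag and the profile
-- has tag-based exclusion enabled, the tagged data is expected to be excluded.
CALL SYSTEM$CLASSIFY(
  'my_db.my_schema.my_table',                    -- hypothetical table name
  'my_db.my_schema.my_classification_profile'    -- hypothetical profile location
);
```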
Set tag on data objects
An object tag is an object that can be set on another object. Snowflake provides a system-defined tag, SNOWFLAKE.CORE.SKIP_SENSITIVE_DATA_CLASSIFICATION, that you can set on objects that you want excluded from automatic sensitive data classification. When the value of this tag is TRUE, Snowflake skips the object during classification.
For example, if you set SNOWFLAKE.CORE.SKIP_SENSITIVE_DATA_CLASSIFICATION = 'TRUE' on a table, then Snowflake skips the table when it automatically classifies the table’s database.
You can set the SNOWFLAKE.CORE.SKIP_SENSITIVE_DATA_CLASSIFICATION tag on a schema, table, or column to control which data is excluded from automatic sensitive data classification.
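To confirm that the tag is in place before relying on it, one option is the SYSTEM$GET_TAG function. The sketch below assumes a hypothetical table my_db.my_schema.my_table; substitute 'SCHEMA' or 'COLUMN' as the object domain when checking other object types.

```sql
-- Returns the tag value associated with the table,
-- or NULL if the tag is not set on it.
SELECT SYSTEM$GET_TAG(
  'SNOWFLAKE.CORE.SKIP_SENSITIVE_DATA_CLASSIFICATION',
  'my_db.my_schema.my_table',   -- hypothetical table name
  'TABLE'
);
```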
Set tag on a schema
When a classification profile is set on a database, all of the schemas in the database are classified during automatic data classification. You can set the SNOWFLAKE.CORE.SKIP_SENSITIVE_DATA_CLASSIFICATION tag on a schema in the database to exclude the schema from the classification process.
For example, suppose you want to automatically classify all schemas in a database except the schema internal. You can run the ALTER SCHEMA command to set the system-defined tag on the schema:
ALTER SCHEMA internal SET TAG SNOWFLAKE.CORE.SKIP_SENSITIVE_DATA_CLASSIFICATION = 'TRUE';
When Snowflake automatically classifies data in the database, it skips data in the schema internal.
For the access control requirements for setting the SNOWFLAKE.CORE.SKIP_SENSITIVE_DATA_CLASSIFICATION tag, see Access control requirements.
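If the schema should later be classified again, the tag can presumably be removed with the standard UNSET TAG clause. This is a sketch that reuses the schema name internal from the example above; it assumes that removing the tag is enough for later automatic classification runs to include the schema again.

```sql
-- Remove the skip tag from the schema so that it is no longer marked for exclusion.
ALTER SCHEMA internal
  UNSET TAG SNOWFLAKE.CORE.SKIP_SENSITIVE_DATA_CLASSIFICATION;
```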
Set tag on a table
When a classification profile is set on a database or schema, all of the tables in that object are classified during automatic data classification. You can set the SNOWFLAKE.CORE.SKIP_SENSITIVE_DATA_CLASSIFICATION tag on a table in the database or schema to exclude the table from the classification process.
For example, suppose you want to automatically classify all tables in a database except the table my_table. You can run the ALTER TABLE command to set the system-defined tag on the table:
ALTER TABLE my_table SET TAG SNOWFLAKE.CORE.SKIP_SENSITIVE_DATA_CLASSIFICATION = 'TRUE';
When Snowflake automatically classifies data in the database, it skips data in the table my_table.
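Because the tag can be set at the schema, table, or column level, it can be useful to see at which level a given table picked it up. The following sketch uses the INFORMATION_SCHEMA.TAG_REFERENCES table function; the database and schema names are hypothetical.

```sql
-- List the skip tag associated with the table, including the level
-- (for example, SCHEMA or TABLE) at which the tag was set.
SELECT tag_name, tag_value, level
FROM TABLE(
  my_db.INFORMATION_SCHEMA.TAG_REFERENCES(
    'my_db.my_schema.my_table',   -- hypothetical fully qualified table name
    'TABLE'
  )
)
WHERE tag_name = 'SKIP_SENSITIVE_DATA_CLASSIFICATION';
```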
Set tag on a column
You might want to automatically classify some, but not all, of the columns of a table. You can set the SNOWFLAKE.CORE.SKIP_SENSITIVE_DATA_CLASSIFICATION tag on a column so Snowflake skips it when classifying the rest of the table. If you exclude a column, the classification result contains an empty value for the column, even if it contains sensitive data.
For example, suppose you want to automatically classify all columns in a table except the column employee_id. You can run the ALTER TABLE … ALTER COLUMN command to set the system-defined tag on the column:
ALTER TABLE my_table ALTER COLUMN employee_id
SET TAG SNOWFLAKE.CORE.SKIP_SENSITIVE_DATA_CLASSIFICATION = 'TRUE';
When Snowflake automatically classifies data in the table, the employee_id field in the JSON result is empty.
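One way to inspect the JSON result is the SYSTEM$GET_CLASSIFICATION_RESULT function. The sketch below assumes that the table has already been classified and uses a hypothetical fully qualified table name.

```sql
-- Retrieve the most recent classification result for the table as JSON.
-- For a column that carries the skip tag, such as employee_id, the
-- corresponding entry is expected to be empty.
SELECT SYSTEM$GET_CLASSIFICATION_RESULT('my_db.my_schema.my_table');
```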
Define classification profile
A classification profile contains the settings that control how Snowflake automatically classifies data in a database or schema. These settings are specified using key-value pairs in an OBJECT.
You must define the enable_tag_based_sensitive_data_exclusion key of the classification profile if you want data excluded from automatic classification. If you do not set the value of this key to true, setting the SNOWFLAKE.CORE.SKIP_SENSITIVE_DATA_CLASSIFICATION tag on objects has no effect.
The following is an example of a classification profile that, when set on a database, excludes properly tagged objects from automatic classification:
CREATE OR REPLACE SNOWFLAKE.DATA_PRIVACY.CLASSIFICATION_PROFILE
  my_classification_profile(
    {
      'minimum_object_age_for_classification_days': 0,
      'maximum_classification_validity_days': 30,
      'auto_tag': true,
      'enable_tag_based_sensitive_data_exclusion': true
    });
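The profile only takes effect after it is set on a database. A minimal sketch of that step follows, assuming the profile was created in a hypothetical schema my_db.my_schema and that the database is named my_db as in the earlier example.

```sql
-- Attach the classification profile to the database so that
-- automatic classification uses its settings.
ALTER DATABASE my_db
  SET CLASSIFICATION_PROFILE = 'my_db.my_schema.my_classification_profile';
```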
Use a method to set a classification profile’s key
If you have an existing classification profile, you can call the SET_ENABLE_TAG_BASED_SENSITIVE_DATA_EXCLUSION method to set the enable_tag_based_sensitive_data_exclusion key of the profile.
To exclude objects tagged with SNOWFLAKE.CORE.SKIP_SENSITIVE_DATA_CLASSIFICATION, call the method with its argument set to true. For example, to allow data to be excluded from the classification of a database that has my_classification_profile set on it, run the following command:
CALL my_classification_profile!SET_ENABLE_TAG_BASED_SENSITIVE_DATA_EXCLUSION(true);
To disable tag-based sensitive data exclusion for a classification profile, run the command with its argument set to false:
CALL my_classification_profile!SET_ENABLE_TAG_BASED_SENSITIVE_DATA_EXCLUSION(false);
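After changing the key, it may help to confirm the profile's current settings. The sketch below assumes that the DESCRIBE method of the classification profile reports the enable_tag_based_sensitive_data_exclusion key along with the other settings.

```sql
-- Show the current settings of the classification profile.
SELECT my_classification_profile!DESCRIBE();
```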
Access control requirements
The following sections describe the roles and privileges that you must have to exclude objects from automatic data classification:
Requirements for setting the tag
The SNOWFLAKE.CLASSIFICATION_ADMIN database role is necessary to create a classification profile. This same database role is required to set the SNOWFLAKE.CORE.SKIP_SENSITIVE_DATA_CLASSIFICATION system tag on a schema, table, or column. Without additional privileges, a user with the SNOWFLAKE.CLASSIFICATION_ADMIN database role can only set the system tag on the objects that they own.
For example, to let users with the role classify_admin set the SNOWFLAKE.CORE.SKIP_SENSITIVE_DATA_CLASSIFICATION tag on objects that they own, run the following command:
GRANT DATABASE ROLE SNOWFLAKE.CLASSIFICATION_ADMIN TO ROLE classify_admin;
If you want an administrator to be able to set the SNOWFLAKE.CORE.SKIP_SENSITIVE_DATA_CLASSIFICATION system tag on any object, not just the ones they own, run the following commands:
GRANT DATABASE ROLE SNOWFLAKE.CLASSIFICATION_ADMIN TO ROLE classify_admin;
GRANT APPLY TAG ON ACCOUNT TO ROLE classify_admin;
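To check that the grants above were applied, the privileges and database roles held by the role can be listed. This sketch reuses the role name classify_admin from the example.

```sql
-- The output should include the SNOWFLAKE.CLASSIFICATION_ADMIN database role
-- and, if it was granted, the APPLY TAG privilege on the account.
SHOW GRANTS TO ROLE classify_admin;
```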
Requirements for configuring the classification profile
You must have the PRIVACY_USER instance role on the classification profile to set the enable_tag_based_sensitive_data_exclusion key of the classification profile.
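As an illustration, an instance role is typically granted with a statement of the following shape. Treat this as a sketch to verify against the classification profile reference: the profile location my_db.my_schema and the reuse of the classify_admin role are assumptions.

```sql
-- Grant the PRIVACY_USER instance role of a specific classification profile
-- to an account role (names are hypothetical).
GRANT SNOWFLAKE.DATA_PRIVACY.CLASSIFICATION_PROFILE ROLE
  my_db.my_schema.my_classification_profile!PRIVACY_USER
  TO ROLE classify_admin;
```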