System data metric functions¶
This topic is a reference for the system data metric functions (DMFs) that Snowflake provides to all accounts. DMFs are the building block of data quality checks.
About system DMFs¶
Snowflake provides system DMFs in the CORE schema of the shared SNOWFLAKE database. System DMFs are maintained by Snowflake; you cannot change the name or functionality of any system DMF.
Each system DMF enables you to measure a different data quality attribute. You can assign more than one system DMF to a table or view to allow for a more comprehensive data quality measurement to address your governance and compliance needs.
System DMFs¶
Currently, Snowflake supports these system DMFs to measure common metrics without having to define them:
| Category | System DMF | Description |
|---|---|---|
| Accuracy | BLANK_COUNT | Determine how many blank values are in a column. |
| BLANK_PERCENT | Determine what percentage of a column’s values are blank. | |
| CASE_FORMAT_VIOLATION_COUNT | Determine how many non-NULL values in a string column have inconsistent casing (not all-uppercase, all-lowercase, or title-case). | |
| CASE_FORMAT_VIOLATION_PERCENT | Determine what percentage of non-NULL values in a string column have inconsistent casing. | |
| FUTURE_TIMESTAMP_COUNT | Determine how many values in a date/timestamp column are in the future relative to the scheduled evaluation time. | |
| FUTURE_TIMESTAMP_PERCENT | Determine what percentage of values in a date/timestamp column are in the future relative to the scheduled evaluation time. | |
| INVALID_JSON_COUNT | Determine how many non-NULL values in a string column are not valid JSON. | |
| INVALID_JSON_PERCENT | Determine what percentage of non-NULL values in a string column are not valid JSON. | |
| INVALID_NUMERIC_TYPE_CAST_COUNT | Determine how many non-NULL values in a string column cannot be parsed as numeric. | |
| INVALID_NUMERIC_TYPE_CAST_PERCENT | Determine what percentage of non-NULL values in a string column cannot be parsed as numeric. | |
| NEGATIVE_COUNT | Determine how many values in a numeric column are negative. | |
| NEGATIVE_PERCENT | Determine what percentage of values in a numeric column are negative. | |
| NULL_COUNT | Determine how many NULL values are in a column. | |
| NULL_PERCENT | Determine what percentage of a column’s values are NULL. | |
| SPECIAL_CHARACTER_COUNT | Determine how many non-NULL values in a string column contain characters outside the alphanumeric range. | |
| SPECIAL_CHARACTER_PERCENT | Determine what percentage of non-NULL values in a string column contain characters outside the alphanumeric range. | |
| UNTRIMMED_STRING_COUNT | Determine how many non-NULL values in a string column have leading or trailing whitespace. | |
| UNTRIMMED_STRING_PERCENT | Determine what percentage of non-NULL values in a string column have leading or trailing whitespace. | |
| ZERO_COUNT | Determine how many values in a numeric column are equal to zero. | |
| ZERO_PERCENT | Determine what percentage of values in a numeric column are equal to zero. | |
| Freshness | FRESHNESS | Determine the freshness of a table’s data based on a timestamp column or the most recent DML operation. |
| DATA_METRIC_SCHEDULE_TIME | Define custom freshness metrics. | |
| Statistics | APPROX_QUANTILE_25 | Determine the approximate 25th percentile value for a numeric column. |
| APPROX_QUANTILE_50 | Determine the approximate 50th percentile (median) value for a numeric column. | |
| APPROX_QUANTILE_99 | Determine the approximate 99th percentile value for a numeric column. | |
| AVG | Determine the average value of a column. | |
| MAX | Determine the maximum value of a column. | |
| MEDIAN | Determine the exact median value for a numeric column. | |
| MIN | Determine the minimum value of a column. | |
| STDDEV | Determine the standard deviation value for a column. | |
| STRING_LENGTH_AVG | Determine the average string length of non-NULL values for a string column. | |
| STRING_LENGTH_MAX | Determine the maximum string length of non-NULL values for a string column. | |
| STRING_LENGTH_MIN | Determine the minimum string length of non-NULL values for a string column. | |
| VARIANCE | Determine the variance value for a numeric column. | |
| Uniqueness | ACCEPTED_VALUES | Determine whether values in a column match a Boolean expression. |
| DUPLICATE_COUNT | Determine the number of duplicate values in a column, including NULL values. | |
| UNIQUE_COUNT | Determine the number of unique, non-NULL values in a column. | |
| Volume | ROW_COUNT | Determine how many records are in the table or view. |