How Snowflake validates semantic views¶
Snowflake verifies that it complies with a set of validation rules when you define a semantic view. These rules ensure your semantic model is well-formed and will function correctly.
These rules are explained in the next sections:
General validation rules¶
The following rules apply to semantic views in general:
Required elements: A semantic view must define at least one dimension or metric.
For example, your TPC-H semantic view needs at least one dimension (like
customer_name
) or a metric (likeorder_average_value
).Primary and foreign keys: In the primary and foreign key definitions, you must use physical base table columns or expressions defined in logical tables that directly refer to a base table column (for example,
t1.fact AS t1.col
).For example, in the TPC-H schema, you can use
c_custkey
as the primary key for thecustomer
table ando_custkey
as the foreign key in theorders
table.c_custkey
ando_custkey
are columns in the physical base tables.Table alias references: When referring to tables in relationships or expressions, you must use their defined aliases.
For example, if you define the table alias
orders AS snowflake_sample_data.tpch.orders_table
, you must use the table aliasorders
(notorders_table
) in the definitions of your metrics.If you don’t specify an alias for a logical table, you must use the logical table name in any expressions.
Validation rules for relationships¶
The following rules apply to relationships in semantic views:
Many-to-one relationships: Relationships work like foreign key constraints.
For example, when you define a relationship as
orders (o_custkey) REFERENCES customers (c_custkey)
, you’re creating a many-to-one relationship fromorders
tocustomers
(many orders can belong to one customer), wherec_custkey
must be a primary key.Transitive relationships: Snowflake automatically derives indirect relationships.
For example, if you define a relationship between
line_items
andorders
and another relationship betweenorders
andcustomer
, Snowflake understands there’s also a relationship betweenline_items
andcustomer
.No circular relationships: You cannot define circular relationships, even through transitive paths.
For example, you cannot define a relationship from
orders
tocustomer
and another relationship fromcustomer
toorders
.No self-references: Currently, a table cannot reference itself (like an employee manager hierarchy where employees can reference other employees as their manager).
Multi-path relationship restrictions: You can define multiple relationships between two tables, but there are limitations.
For example, if
line_items
is related toorders
through bothorder_key
and another column, those tables cannot refer to each other’s semantic expressions.One-to-one relationship restrictions: One-to-one relationships have limitations.
For example, suppose that you define a relationship
orders(id) REFERENCES order_summary
, whereid
is a primary keyorders
.order_summary
cannot refer to semantic expressions inorders
semantic expressions, thoughorders
can refer to semantic expressions inorder_summary
.
Expression validation rules¶
The following rules apply to semantic expressions in facts, dimensions, and metrics:
General rules about expressions¶
The following rules apply to semantic expressions in general:
Expression types: Dimensions and facts are row-level expressions (unaggregated), while metrics are aggregate-level expressions.
For example,
customer_name
is a dimension (row-level), whileorder_average_value
is a metric (aggregate-level).Table association: Every semantic expression must be associated with a table.
For example,
customer_name
must be defined ascustomer.customer_name
andorder_average_value
asorders.order_average_value
.Same-table references: Expressions can refer to base table columns or other expressions on the same logical table using either qualified or unqualified names.
For example, in the
orders
table, you could defineorders.shipping_month
asMONTH(o_shipdate)
(using the unqualified column name)MONTH(orders.o_shipdate)
(using the qualified name)
Cross-table limitations: Expressions cannot refer to base table columns from other tables or expressions from unrelated logical tables.
For example,
customer.customer_name
cannot directly reference an expression from theorders
table unless there’s a relationship between them. To work with data across tables, you must:Define relationships between logical tables (for example, between
customer
andorders
throughc_custkey
).Define a fact on the source table (for example,
orders.total_value
).Refer to these expressions from a connected logical table (for example,
customer.order_value
can refer toorders.total_value
).
Name resolution: If both a semantic expression and a column have the same name, references to that name resolve to the semantic expression.
For example, if you define a
region
dimension and there’s also aregion
column,region
in expressions resolves to the dimension, not the column. An exception is when an expression refers to the same name in its definition (for example,customer.c_name AS customers.c_name
). The reference resolves to the column, rather than to the defining expression itself.Expression reference cycles: You cannot create circular references between expressions.
For example, you cannot define
customer.total_value
based onorders.customer_value
and then defineorders.customer_value
based oncustomer.total_value
.Table reference cycles: You cannot create circular references between logical tables in expression definitions.
For example, you cannot define
customer.total_value
based onorders.customer_value
and then defineorders.customer_count
based oncustomer.c_custkey.
Function usage: You can use scalar functions like YEAR* / DAY* / WEEK* / MONTH / QUARTER in dimensions, but table functions are not allowed. Window functions are allowed in dimensions and facts but not in metrics.
Rules for row-level expressions (dimensions and facts)¶
The following rules apply to row-level expressions in dimensions and facts:
Same-table references: A row-level expression can directly refer to columns from its own table.
For example,
customers.customer_name
can be defined ascustomers.c_name
directly.Equal or lower granularity: A row-level expression can directly refer to other row-level expressions at the same or lower granularity.
For example,
orders.order_details
can refer tocustomer.customer_name
becausecustomer
is at a lower granularity thanorders
(one customer can have many orders).Higher granularity references: When referencing row-level expressions at higher granularity, a row-level expression must use aggregation.
For example,
customer.total_orders
must useCOUNT(orders.o_orderkey)
becauseorders
is at a higher granularity thancustomer
(one customer can have many orders).Aggregate references: A dimension like
orders.order_type
cannot refer to a metric likeorders.order_average_value
, butcustomer.customer_segment
can refer toorders.order_average_value
becausecustomer
is at a lower granularity than orders.
Rules for aggregate-level expressions (metrics)¶
The following rules apply to aggregate-level expressions in metrics:
Basic aggregation: A metric must use an aggregate function.
For example,
orders.order_average_value
must useAVG(orders.o_totalprice)
.Equal or lower granularity: When referring to row-level expressions at equal or lower granularity, a metric must use a single aggregate.
For example,
orders.total_value
can useSUM(line_items.discounted_price)
becauseline_items
is at lower granularity than orders.Higher granularity references: When referring to row-level expressions at higher granularity, a metric must use nested aggregation.
For example,
customer.average_order_value
must useAVG(SUM(orders.o_totalprice))
becauseorders
is at higher granularity thancustomer
.Other aggregate references: A metric can directly refer to other metrics at equal or lower granularity without aggregation.
For example,
orders.profit_margin
can be defined asorders.total_revenue / orders.total_cost
without additional aggregation. However, when referring to metrics at higher granularity, an aggregation is required.