Understanding How Snowflake Can Eliminate Redundant Joins

In some cases, a join on a key column can refer to tables that are not needed for the join. If your tables have key columns and you are using and enforcing the UNIQUE, PRIMARY KEY, and FOREIGN KEY constraints, Snowflake can improve query performance by eliminating unnecessary joins on key columns.

These optimizations are performed only if you use the RELY constraint property to indicate that the data in your tables complies with the constraints around primary keys and foreign keys.

Setting the RELY Constraint Property to Eliminate Unnecessary Joins

Snowflake only performs this optimization on joins if you indicate that the data in your tables comply with the UNIQUE, PRIMARY KEY, and FOREIGN KEY constraints.

As mentioned in Supported Constraint Types, Snowflake does not enforce UNIQUE, PRIMARY KEY, and FOREIGN KEY constraints on standard tables, but does enforce them on hybrid tables. For standard tables, you are responsible for enforcing constraints on the data.

If you have ensured that the data complies with these constraints and you want Snowflake to eliminate unnecessary joins, set the RELY constraint property on the UNIQUE, PRIMARY KEY, FOREIGN KEY constraints.

Note

You are responsible for maintaining the integrity of your constraints (UNIQUE, PRIMARY KEY, and FOREIGN KEY). If the integrity of your constraints is not maintained, the query results might differ if the RELY constraint property is set (compared to the results with NORELY).

Examples of Eliminating Unnecessary Joins

The following examples demonstrates cases in which Snowflake eliminates joins and references to tables that are not necessary:

In these examples:

  • dim_products is a table that contains a row for each product available for purchase.

    In this table, product_id is a column that uniquely identifies a product.

  • fact_sales is a table that contains a row for each sale of a product.

    In this table, product_id is a column that identifies the product that was sold. The IDs in this column correspond to the IDs in the product_id column of the dim_products table.

Example 1: Eliminating an Unnecessary Left Outer Join

This following is an example of an unnecessary left outer join that Snowflake can optimize:

SELECT f.*
FROM fact_sales f
LEFT OUTER JOIN dim_products p
ON f.product_id = p.product_id;
Copy

The join is unnecessary because the statement does not refer to any columns in the dim_products table on the right (other than the primary key column for the join).

If the dim_products.product_id column has the UNIQUE or PRIMARY KEY constraint with the RELY property, Snowflake can identify this join as unnecessary and can eliminate the reference to the dim_products table on the right.

Example 2: Eliminating an Unnecessary Self-Join

This following is an example of an unnecessary self-join that Snowflake can optimize:

SELECT p1.product_id, p2.product_name
FROM dim_products p1, dim_products p2
WHERE p1.product_id = p2.product_id;
Copy

The statement unnecessarily joins the dim_products table with itself and selects columns from that table.

If the dim_products.product_id column has the UNIQUE or PRIMARY KEY constraint with the RELY property, Snowflake can identify this join as unnecessary and can eliminate the reference to the dim_products table on the right.

Example 3: Eliminating an Unnecessary Join on a Primary Key and Foreign Key

This following is an example of an unnecessary inner join that Snowflake can optimize.

SELECT p.product_id, f.units_sold
FROM   fact_sales f, dim_products p
WHERE  f.product_id = p.product_id;
Copy

The statement does not refer to any columns in the dim_products table on the right, other than the primary key column for the join.

If the dim_products.product_id column has the PRIMARY KEY constraint and the fact_sales.product_id column has the FOREIGN KEY constraint, Snowflake can identify this join as unnecessary and can eliminate the reference to the dim_products table on the right.

Language: English