Speeding up join queries with search optimization¶
The search optimization service can improve the performance of join queries that have a small number of distinct values on the build side of the join.
For example, the search optimization service can improve the performance of these types of joins:
Suppose that
products
is a table containing a row for each product, andsales
is a table containing a row for each sale of a product. Theproducts
table contains fewer rows and is smaller than thesales
table. To find all sales of a specific product, you join thesales
table (the larger table) with theproducts
table (the smaller table). Because theproducts
table is small, there are few distinct values on the build side of the join.Note
In data warehousing, the large table is often referred to as the fact table (link removed). The small table is referred to as the dimension table (link removed). The rest of this topic uses these terms when referring to the large table and the small table in a join.
Suppose that
customers
is a table containing a row for each customer, andsales
is a table containing a row for each sale. Both tables are large. To find all sales for a specific customer, you join thesales
table (the probe side) with thecustomers
table (the build side) and use a filter so that there are a small number of distinct values on the build side of the join.
Enabling the search optimization service to improve the performance of joins¶
To enable the search optimization service to improve the performance of joins, add search optimization to the table on the probe side of the join. This table is usually a large table that isn’t filtered in join queries, such as a fact table.
To take advantage of search optimization, make sure the build side of the join has a small number of distinct values, either because it’s a small dimension table or because of a selective filter. The search optimization costs of a query are proportionate to the number of distinct values that must be looked up on the build side of the join. If this number is too large, Snowflake might decide against using the search access path and use the regular table access path instead.
Supported join predicates¶
The search optimization service can improve the performance of queries with the following types of join predicates:
Equality predicates of the form
probe_side_table.column = build_side_table.column
.Transformations on the build-side operand of the predicate (for example, string concatenation, addition, and so on).
Conjunctions (
AND
) of multiple equality predicates.
Examples of supported queries¶
This section shows examples of join queries that can benefit from search optimization.
Example: Simple equality predicate¶
The following is an example of a supported query that uses a simple equality predicate as the join predicate. This query joins a
table named sales
with a table named customers
. The probe-side table sales
is large and has search optimization
enabled. The build-side table customers
is also large, but the input from this table is small, due to the selective filter on the
customer_id
column.
SELECT sales.date, customer.name
FROM sales JOIN customers ON (sales.customer_id = customers.customer_id)
WHERE customers.customer_id = 2094;
Example: Predicate transformed on the dimension-side operand¶
The following query joins a fact table named sales
with a dimension table named products
. The fact table is large and
has search optimization enabled. The dimension table is small.
This query transforms the dimension-side operand of the predicate (for example, by multiplying values in the join condition) and can benefit from search optimization:
SELECT sales.date, product.name
FROM sales JOIN products ON (sales.product_id = product.old_id * 100)
WHERE product.category = 'Cutlery';
Example: Predicate spanning multiple columns¶
Queries in which a join predicate spans multiple columns can benefit from search optimization:
SELECT sales.date, product.name
FROM sales JOIN products ON (sales.product_id = product.id and sales.location = product.place_of_production)
WHERE product.category = 'Cutlery';
Example: Query using point-lookup filters and join predicates¶
In a query that uses both regular point-lookup filters and join predicates, the search optimization service can improve the
performance of both. In the following query, the search optimization service can improve the sales.location
point-lookup
predicate as well as the product_id
join predicate:
SELECT sales.date, product.name
FROM sales JOIN products ON (sales.product_id = product.id)
WHERE product.category = 'Cutlery'
AND sales.location = 'Buenos Aires';
Limitations¶
The following limitations apply to the search optimization service and join queries:
Disjuncts (
OR
) in join predicates currently aren’t supported.LIKE, ILIKE, and RLIKE join predicates currently aren’t supported.
Join predicates on VARIANT columns currently aren’t supported.
EQUAL_NULL equality predicates currently aren’t supported.
The current limitations of the search optimization service also apply to join queries.