- 类别:
WITH¶
WITH 子句是位于 SELECT 语句正文前面的可选子句,它定义了一个或多个可稍后在语句中使用的 CTEs (公用表表达式)。例如,可以在 FROM 子句中引用 CTEs。
备注
在创建和调用类似于存储过程的匿名过程时,可以使用 WITH 子句。该子句修改的是 CALL 命令而不是 SELECT 命令。有关更多信息,请参阅 CALL(使用匿名过程)。
WITH 子句与机器学习模型对象一起使用,以创建模型特定版本的别名,此别名随后可用于调用该版本的方法。请参阅 在 SQL 中调用模型方法。
- 另请参阅:
语法¶
子查询:
[ WITH
<cte_name1> [ ( <cte_column_list> ) ] AS ( SELECT ... )
[ , <cte_name2> [ ( <cte_column_list> ) ] AS ( SELECT ... ) ]
[ , <cte_nameN> [ ( <cte_column_list> ) ] AS ( SELECT ... ) ]
]
SELECT ...
递归 CTE:
[ WITH [ RECURSIVE ]
<cte_name1> ( <cte_column_list> ) AS ( anchorClause UNION ALL recursiveClause )
[ , <cte_name2> ( <cte_column_list> ) AS ( anchorClause UNION ALL recursiveClause ) ]
[ , <cte_nameN> ( <cte_column_list> ) AS ( anchorClause UNION ALL recursiveClause ) ]
]
SELECT ...
其中:
anchorClause ::= SELECT <anchor_column_list> FROM ... recursiveClause ::= SELECT <recursive_column_list> FROM ... [ JOIN ... ]
cte_name1
、cte_nameN
CTE 名称必须遵循视图和类似 对象标识符 的规则。
cte_column_list
CTE (公用表表达式)中的列名。
anchor_column_list
递归 CTE 的锚定子句中使用的列。此列表中的列必须与
cte_column_list
中定义的列相对应。recursive_column_list
递归 CTE 的递归子句中使用的列。此列表中的列必须与
cte_column_list
中定义的列相对应。
有关更多详细信息,请参阅 锚定子句 和 递归子句 (本主题内容)。有关锚定子句和递归子句如何协同工作的详细说明,请参阅 使用 CTEs (公用表表达式)。
使用说明¶
一般用法¶
WITH 子句可以递归地引用自身,也可以引用同一子句中较早出现的其他 CTEs。例如,
cte_name2
可以引用cte_name1
及其自身,而cte_name1
可以引用其自身,但不能引用cte_name2
。可以在 WITH 子句中混合使用递归和非递归(迭代和非迭代) CTE 子句。CTE 子句的顺序应该是,如果一个 CTE 需要引用另一个 CTE,则应在语句的前面定义要引用的 CTE(例如,第二个 CTE 可以引用第一个 CTE,但反之则不然)。
CTEs 不需要根据它们是否递归来按顺序列出。例如,非递归 CTE 可以紧跟在关键字
RECURSIVE
之后列出,而递归 CTE 可以紧跟在非递归 CTE 之后。在递归 CTE 中,锚定子句或递归子句(或两者)都可以引用另一个 CTE。
对于递归 CTEs,
cte_column_list
是必需的。对于非递归 CTEs,
cte_column_list
是可选的。确保在递归 CTE 中使用的是
UNION ALL
,而不是UNION
。关键字
RECURSIVE
是可选的。无论是否指定了
RECURSIVE
,CTEs 都可以递归。即使没有递归 CTEs,也可以使用关键字
RECURSIVE
。如果使用
RECURSIVE
,则只能使用一次,即使多个 CTE 是递归的。
尽管无论有没有关键字
RECURSIVE
, SQL 语句都可以正常工作,但正确使用关键字会使代码更易于理解和维护。如果一个或多个 CTEs 是递归的,Snowflake 建议使用关键字RECURSIVE
。如果没有 CTEs 是递归的,Snowflake 强烈建议忽略该关键字。
注意
使用递归 CTE 时,可以创建一个查询,该查询进入无限循环并消耗 Credit,直到查询成功、查询超时(例如,超过 STATEMENT_TIMEOUT_IN_SECONDS 参数指定的秒数),或者 取消查询。
有关无限循环如何发生以及如何避免此问题的指南的信息,请参阅 对递归 CTE 进行故障排除。
例如,要将迭代次数限制在 10 次以下,请执行以下操作:
WITH cte AS (
SELECT ..., 1 as level ...
UNION ALL
SELECT ..., cte.level + 1 as level
FROM cte ...
WHERE ... level < 10
) ...
限制¶
递归 CTEs 的 Snowflake 实现不支持某些其他系统支持的以下关键字:
SEARCH DEPTH FIRST BY ...
CYCLE ... SET ...
锚定子句¶
递归 CTE 中的锚定子句是 SELECT 语句。
锚定子句在其嵌入的语句执行期间执行 一次 ;它在递归子句之前运行,并从递归 CTE 生成第一组行。这些行不仅包含在查询的输出中,而且还由递归子句引用。
锚定子句可以包含 SELECT 子句中允许的任何 SQL 结构。但是,锚定子句不能引用 cte_name1
;只有递归子句才能引用 cte_name1
。
尽管锚定子句通常与递归子句从同一表中进行选择,但这不是必需的。锚定子句可以从任何类似表的数据源中进行选择,包括另一个表、视图、UDTF 或常量值。
锚定子句选择层次结构的单个“级别”,通常是最高级别或最高兴趣级别。例如,如果查询旨在显示汽车的“零件分解图”,则锚定子句将返回最高级别的组件,即汽车本身。
锚定子句的输出表示层次结构的一层,此层存储为在递归子句的第一次迭代中访问的“视图”的内容。
递归子句¶
递归子句是 SELECT 语句。此 SELECT 仅限于投影、筛选和联接(内联接和外联接,其中递归引用位于外联接的保留侧)。递归子句不能包含:
聚合函数或窗口函数、
GROUP BY
、ORDER BY
、LIMIT
或DISTINCT
。
递归子句可以(并且通常确实)引用 cte_name1
,就像 CTE 是表或视图一样。
递归子句通常包括 JOIN,用于将锚定子句中使用的表联接到 CTE。但是, JOIN 可以联接多个表或类似表的数据源(视图等)。
递归子句的第一次迭代从锚定子句中的数据开始。然后,该数据将联接到递归子句的 FROM 子句中的其他表。
每个后续迭代都从上一次迭代的数据开始。
您可以将 CTE 子句或“视图”视为保存上一次迭代中的内容,以便可以联接这些内容。请注意,在任何一次迭代期间, CTE 仅包含上一次迭代的内容,而不包含所有先前迭代累积的结果。累积的结果(包括来自锚定子句的结果)存储在单独的位置。
递归 CTE 中的列的列表¶
递归 CTE 中有三个列的列表:
cte_column_list
anchor_column_list
(在锚定子句中)recursive_column_list
(在递归子句中)
递归 CTE 可以包含其他列的列表(例如在子查询中),但 必须 存在这三个列的列表。
这三个列的列表必须全部相互对应。
在伪代码中,这看起来类似于:
WITH RECURSIVE cte_name (X, Y) AS
(
SELECT related_to_X, related_to_Y FROM table1
UNION ALL
SELECT also_related_to_X, also_related_to_Y
FROM table1 JOIN cte_name ON <join_condition>
)
SELECT ... FROM ...
列 X
和 related_to_X
必须对应;锚定子句生成 CTE 表示的“视图”的初始“内容”,因此锚定子句(例如列 related_to_x
)中的每一列都必须生成属于 CTE 的相应列(例如列 X
)的输出。
列 also_related_to_X
和 X
必须对应;在递归子句的每次迭代中,该子句的输出将成为下一次迭代的 CTE/视图的新内容。
此外,列 related_to_X
和 also_related_to_X
必须对应,因为它们分别位于 UNION ALL
运算符的一侧,并且 UNION ALL
运算符每一侧的列必须对应。
示例¶
非递归示例¶
本节提供了示例查询和示例输出。为了保持示例简短,代码省略了用于创建和加载表的语句。
第一个示例使用一个简单的 WITH 子句作为视图来提取数据的子集,在本例中是 1976 年发行的音乐专辑。对于这个小型数据库,查询输出是 1976 年的专辑“Amigos”和“Look Into The Future”:
with albums_1976 as (select * from music_albums where album_year = 1976) select album_name from albums_1976 order by album_name; +----------------------+ | ALBUM_NAME | |----------------------| | Amigos | | Look Into The Future | +----------------------+
下一个示例将 WITH 子句与早前的 WITH 子句结合使用;名为 journey_album_info_1976
的 CTE 使用名为 album_info_1976
的 CTE。输出是专辑“Look Into The Future”,其中乐队名称如下:
with album_info_1976 as (select m.album_ID, m.album_name, b.band_name from music_albums as m inner join music_bands as b where m.band_id = b.band_id and album_year = 1976), Journey_album_info_1976 as (select * from album_info_1976 where band_name = 'Journey') select album_name, band_name from Journey_album_info_1976; +----------------------+-----------+ | ALBUM_NAME | BAND_NAME | |----------------------+-----------| | Look Into The Future | Journey | +----------------------+-----------+
此示例列出了在 Santana 专辑和 Journey 专辑中演奏的音乐家。此示例不使用 WITH 子句。对于此查询(以及接下来的几个查询,所有这些查询都是运行同一查询的等效方式),输出是在 Santana 专辑和 Journey 专辑中演奏的音乐家的 IDs 和姓名:
select distinct musicians.musician_id, musician_name from musicians inner join musicians_and_albums inner join music_albums inner join music_bands where musicians.musician_ID = musicians_and_albums.musician_ID and musicians_and_albums.album_ID = music_albums.album_ID and music_albums.band_ID = music_bands.band_ID and music_bands.band_name = 'Santana' intersect select distinct musicians.musician_id, musician_name from musicians inner join musicians_and_albums inner join music_albums inner join music_bands where musicians.musician_ID = musicians_and_albums.musician_ID and musicians_and_albums.album_ID = music_albums.album_ID and music_albums.band_ID = music_bands.band_ID and music_bands.band_name = 'Journey' order by musician_ID; +-------------+---------------+ | MUSICIAN_ID | MUSICIAN_NAME | |-------------+---------------| | 305 | Gregg Rolie | | 306 | Neal Schon | +-------------+---------------+
如您所见,上一个查询包含重复的代码。接下来的几个示例演示了如何使用一个或多个显式视图来简化此查询,然后如何使用 CTEs 简化此查询。
此查询演示了如何使用视图来减少上一个示例的重复和复杂性(与上一个示例一样,此查询不使用 WITH 子句):
create or replace view view_musicians_in_bands AS select distinct musicians.musician_id, musician_name, band_name from musicians inner join musicians_and_albums inner join music_albums inner join music_bands where musicians.musician_ID = musicians_and_albums.musician_ID and musicians_and_albums.album_ID = music_albums.album_ID and music_albums.band_ID = music_bands.band_ID;在此视图中,可以将原始查询重写如下:
select musician_id, musician_name from view_musicians_in_bands where band_name = 'Santana' intersect select musician_id, musician_name from view_musicians_in_bands where band_name = 'Journey' order by musician_ID; +-------------+---------------+ | MUSICIAN_ID | MUSICIAN_NAME | |-------------+---------------| | 305 | Gregg Rolie | | 306 | Neal Schon | +-------------+---------------+
此示例使用 WITH 子句执行与前面的查询相同的操作:
with musicians_in_bands as ( select distinct musicians.musician_id, musician_name, band_name from musicians inner join musicians_and_albums inner join music_albums inner join music_bands where musicians.musician_ID = musicians_and_albums.musician_ID and musicians_and_albums.album_ID = music_albums.album_ID and music_albums.band_ID = music_bands.band_ID) select musician_ID, musician_name from musicians_in_bands where band_name = 'Santana' intersect select musician_ID, musician_name from musicians_in_bands where band_name = 'Journey' order by musician_ID ; +-------------+---------------+ | MUSICIAN_ID | MUSICIAN_NAME | |-------------+---------------| | 305 | Gregg Rolie | | 306 | Neal Schon | +-------------+---------------+
这些语句创建更精细的视图(此示例不使用 WITH 子句):
列出特定乐队的专辑:
create or replace view view_album_IDs_by_bands AS select album_ID, music_bands.band_id, band_name from music_albums inner join music_bands where music_albums.band_id = music_bands.band_ID;列出在专辑中演奏的音乐家:
create or replace view view_musicians_in_bands AS select distinct musicians.musician_id, musician_name, band_name from musicians inner join musicians_and_albums inner join view_album_IDs_by_bands where musicians.musician_ID = musicians_and_albums.musician_ID and musicians_and_albums.album_ID = view_album_IDS_by_bands.album_ID;现在,使用这些视图来查询在 Santana 和 Journey 专辑中演奏的音乐家:
select musician_id, musician_name from view_musicians_in_bands where band_name = 'Santana' intersect select musician_id, musician_name from view_musicians_in_bands where band_name = 'Journey' order by musician_ID; +-------------+---------------+ | MUSICIAN_ID | MUSICIAN_NAME | |-------------+---------------| | 305 | Gregg Rolie | | 306 | Neal Schon | +-------------+---------------+
这些语句创建更精细的隐式视图(此示例使用 WITH 子句):
with album_IDs_by_bands as (select album_ID, music_bands.band_id, band_name from music_albums inner join music_bands where music_albums.band_id = music_bands.band_ID), musicians_in_bands as (select distinct musicians.musician_id, musician_name, band_name from musicians inner join musicians_and_albums inner join album_IDs_by_bands where musicians.musician_ID = musicians_and_albums.musician_ID and musicians_and_albums.album_ID = album_IDS_by_bands.album_ID) select musician_ID, musician_name from musicians_in_bands where band_name = 'Santana' intersect select musician_ID, musician_name from musicians_in_bands where band_name = 'Journey' order by musician_ID ; +-------------+---------------+ | MUSICIAN_ID | MUSICIAN_NAME | |-------------+---------------| | 305 | Gregg Rolie | | 306 | Neal Schon | +-------------+---------------+
递归示例¶
这是使用递归 CTE 生成斐波那契数列的基本示例:
WITH RECURSIVE current_f (current_val, previous_val) AS ( SELECT 0, 1 UNION ALL SELECT current_val + previous_val, current_val FROM current_f WHERE current_val + previous_val < 100 ) SELECT current_val FROM current_f ORDER BY current_val; +-------------+ | CURRENT_VAL | |-------------| | 0 | | 1 | | 1 | | 2 | | 3 | | 5 | | 8 | | 13 | | 21 | | 34 | | 55 | | 89 | +-------------+
此示例是一个使用递归 CTE 的查询,该查询显示汽车的“零件分解图”:
-- The components of a car. CREATE TABLE components ( description VARCHAR, component_ID INTEGER, quantity INTEGER, parent_component_ID INTEGER ); INSERT INTO components (description, quantity, component_ID, parent_component_ID) VALUES ('car', 1, 1, 0), ('wheel', 4, 11, 1), ('tire', 1, 111, 11), ('#112 bolt', 5, 112, 11), ('brake', 1, 113, 11), ('brake pad', 1, 1131, 113), ('engine', 1, 12, 1), ('piston', 4, 121, 12), ('cylinder block', 1, 122, 12), ('#112 bolt', 16, 112, 12) -- Can use same type of bolt in multiple places ;WITH RECURSIVE current_layer (indent, layer_ID, parent_component_ID, component_id, description, sort_key) AS ( SELECT '...', 1, parent_component_ID, component_id, description, '0001' FROM components WHERE component_id = 1 UNION ALL SELECT indent || '...', layer_ID + 1, components.parent_component_ID, components.component_id, components.description, sort_key || SUBSTRING('000' || components.component_ID, -4) FROM current_layer JOIN components ON (components.parent_component_id = current_layer.component_id) ) SELECT -- The indentation gives us a sort of "side-ways tree" view, with -- sub-components indented under their respective components. indent || description AS description, component_id, parent_component_ID -- The layer_ID and sort_key are useful for debugging, but not -- needed in the report. -- , layer_ID, sort_key FROM current_layer ORDER BY sort_key; +-------------------------+--------------+---------------------+ | DESCRIPTION | COMPONENT_ID | PARENT_COMPONENT_ID | |-------------------------+--------------+---------------------| | ...car | 1 | 0 | | ......wheel | 11 | 1 | | .........tire | 111 | 11 | | .........#112 bolt | 112 | 11 | | .........brake | 113 | 11 | | ............brake pad | 1131 | 113 | | ......engine | 12 | 1 | | .........#112 bolt | 112 | 12 | | .........piston | 121 | 12 | | .........cylinder block | 122 | 12 | +-------------------------+--------------+---------------------+
有关更多示例,请参阅 使用 CTEs (公用表表达式)。