Categories:: Query syntax

MATCH_RECOGNIZE¶

Recognizes matches of a pattern in a set of rows. MATCH_RECOGNIZE accepts a set of rows (from a table, view, subquery, or other source) as input, and returns all matches for a given row pattern within this set. The pattern is defined similarly to a regular expression.

此子句可以返回：

属于每个匹配项的所有行。
每个匹配项一个汇总行。

MATCH_RECOGNIZE is typically used to detect events in time series. For example, MATCH_RECOGNIZE can search a stock price history table for shapes like V (down followed by up) or W (down, up, down, up).

MATCH_RECOGNIZE is an optional subclause of the FROM clause.

Note

You cannot use the MATCH_RECOGNIZE clause in a recursive common table expression (CTE).

See also:: Identifying Sequences of Rows That Match a Pattern

语法

MATCH_RECOGNIZE (
    [ PARTITION BY <expr> [, ... ] ]
    [ ORDER BY <expr> [, ... ] ]
    [ MEASURES <expr> [AS] <alias> [, ... ] ]
    [ ONE ROW PER MATCH |
      ALL ROWS PER MATCH [ { SHOW EMPTY MATCHES | OMIT EMPTY MATCHES | WITH UNMATCHED ROWS } ]
      ]
    [ AFTER MATCH SKIP
          {
          PAST LAST ROW   |
          TO NEXT ROW   |
          TO [ { FIRST | LAST} ] <symbol>
          }
      ]
    PATTERN ( <pattern> )
    DEFINE <symbol> AS <expr> [, ... ]
)

必需的分子句

DEFINE：定义符号¶

DEFINE <symbol1> AS <expr1> [ , <symbol2> AS <expr2> ]

符号（也称为“模式变量”）是模式的基本组成部分。

符号由表达式定义。如果某行的表达式求值结果为 true，则向该行分配符号。可以向一行分配多个符号。

Symbols that are not defined in the DEFINE clause, but are used in the pattern, are always assigned to all rows. Implicitly, they are equivalent to the following example:

...
define
    my_example_symbol as true
...

Patterns are defined based on symbols and operators.

PATTERN：指定要匹配的模式¶

PATTERN ( <pattern> )

模式定义了表示匹配项的有效行序列。模式的定义类似于正则表达式 (regex)，由符号、运算符和量词构成。

For example, suppose that symbol S1 is defined as stock_price < 55, and symbol S2 is defined as stock price > 55. The following pattern specifies a sequence of rows in which the stock price increased from less than 55 to greater than 55:

PATTERN (S1 S2)

以下是有关模式定义的更复杂示例：

^ S1 S2*? ( {- S3 -} S4 )+ | PERMUTE(S1, S2){1,2} $

以下部分详细描述了此模式的各个组成部分。

Note

MATCH_RECOGNIZE uses backtracking to match patterns. As is the case with other regular expression engines that use backtracking, some combinations of patterns and data to match can take a long time to execute, which can result in high computation costs.

To improve performance, define a pattern that is as specific as possible:

Make sure that each row matches only one symbol or a small number of symbols
Avoid using symbols that match every row (e.g. symbols not in the DEFINE clause or symbols that are defined as true)
Define an upper limit for quantifiers (e.g. {,10} instead of *).

For example, the following pattern can result in increased costs if no rows match:

symbol1+ any_symbol* symbol2

If there is an upper limit to the number of rows that you want to match, you can specify that limit in the quantifiers to improve performance. In addition, rather than specifying that you want to find any_symbol that follows symbol1, you can look for a row that is not symbol1 (not_symbol1, in this example);

symbol1{1,limit} not_symbol1{,limit} symbol2

In general, you should monitor the query execution time to verify that the query is not taking longer than expected.

Symbols:

符号与其分配到的行相匹配。以下符号可用：

symbol. For example, S1, … , S4 Those are symbols that were defined in the DEFINE subclause and are evaluated per row. (These can also include symbols that were not defined and are automatically assigned to all rows.)
^ (Start of partition.) This is a virtual symbol that denotes the start of a partition and has no row associated with it. You can use it to require a match to start only at the beginning of a partition.

For an example, see Matching Patterns Relative to the Beginning or End of a Partition.
$ (End of partition.) This is a virtual symbol that denotes the end of a partition and has no row associated with it. You can use it to require a match to end only at the end of a partition.

For an example, see Matching Patterns Relative to the Beginning or End of a Partition.

Quantifiers:

量词可以放在符号或运算之后。量词表示关联符号或运算出现的最小和最大次数。以下量词可用：

Quantifier Meaning
+ 1 or more. For example, ( {- S3 -} S4 )+.
* 0 or more. For example, S2*?.
? 0 or 1.
{n} Exactly n.
{n,} n or more.
{,m} 0 to m.
{n, m} n to m. For example, PERMUTE(S1, S2){1,2}.

Quantifier	Meaning
`+`	1 or more. For example, `( {- S3 -} S4 )+`.
`*`	0 or more. For example, `S2*?`.
`?`	0 or 1.
`{n}`	Exactly n.
`{n,}`	n or more.
`{,m}`	0 to m.
`{n, m}`	n to m. For example, `PERMUTE(S1, S2){1,2}`.

By default, quantifiers are in “greedy mode”, which means they try to match the maximum quantity if possible. To put a quantifier into “reluctant mode”, in which the quantifier tries to match the minimum quantity if possible, place a ? after the quantifier (e.g. S2*?).

Operators:

运算符指定在行序列中应以何种顺序出现符号或其他运算，以形成有效的匹配。以下运算符可用：

Operator Meaning
... ... (space) Concatenation. Specifies that a symbol or operation should follow another one. For example, S1 S2 means that the condition defined for S2 should occur after the condition defined for S1.
{- ... -} Exclusion. Excludes the contained symbols or operations from the output. For example, {- S3 -} excludes operator S3 from the output. Excluded rows will not appear in the output, but will be included in the evaluation of MEASURES expressions.
( ... ) Grouping. Used to override the precedence of an operator or to apply the same quantifier for symbols or operations in the group. In this example, the quantifier + applies to the sequence {- S3 -} S4, not merely S4.
PERMUTE(..., ...) Permutation. Matches any permutation of the specified patterns. For example, PERMUTE(S1, S2) matches either S1 S2 or S2 S1. PERMUTE() takes an unlimited number of arguments.
... | ... Alternative. Specifies that either the first symbol or operation or the other one should occur. For example, ( S3 S4 ) | PERMUTE(S1, S2). The alternative operator has precedence over the concatenation operator.

Operator	Meaning
`... ...` (space)	Concatenation. Specifies that a symbol or operation should follow another one. For example, `S1 S2` means that the condition defined for `S2` should occur after the condition defined for `S1`.
`{- ... -}`	Exclusion. Excludes the contained symbols or operations from the output. For example, `{- S3 -}` excludes operator `S3` from the output. Excluded rows will not appear in the output, but will be included in the evaluation of `MEASURES` expressions.
`( ... )`	Grouping. Used to override the precedence of an operator or to apply the same quantifier for symbols or operations in the group. In this example, the quantifier `+` applies to the sequence `{- S3 -} S4`, not merely `S4`.
`PERMUTE(..., ...)`	Permutation. Matches any permutation of the specified patterns. For example, `PERMUTE(S1, S2)` matches either `S1 S2` or `S2 S1`. `PERMUTE()` takes an unlimited number of arguments.
`... \| ...`	Alternative. Specifies that either the first symbol or operation or the other one should occur. For example, `( S3 S4 ) \| PERMUTE(S1, S2)`. The alternative operator has precedence over the concatenation operator.

可选分子句

ORDER BY：匹配前对行进行排序¶

{ORDER BY orderItem1 [ , orderItem2 ... ]}

其中：

orderItem ::= { <column_alias> | <expr> } [ { ASC | DESC } ] [ NULLS { FIRST | LAST } ]

Define the order of the rows as you would for window functions. This is the order in which the individual rows of each partition are passed to the MATCH_RECOGNIZE operator.

For more information, see Partitioning and Sorting the Rows.

PARTITION BY：将行分区到窗口¶

{PARTITION BY <expr1> [ , <expr2> ... ]}

Partition the input set of rows as you would for window functions. MATCH_RECOGNIZE performs matching individually for each resulting partition.

分区不仅将彼此相关的行进行分组，还利用了 Snowflake 的分布式数据处理能力，因为可以并行处理不同的分区。

For more information about partitioning, see Partitioning and Sorting the Rows.

MEASURES：指定其他输出列¶

MEASURES <expr1> [AS] <alias1> [ ... , <exprN> [AS] <aliasN> ]

“Measures” are optional additional columns that are added to the output of the MATCH_RECOGNIZE operator. The expressions in the MEASURES subclause have the same capabilities as the expressions in the DEFINE subclause. For further information, see Symbols.

Within the MEASURES subclause, the following functions specific to MATCH_RECOGNIZE are available:

MATCH_NUMBER() Returns the sequential number of the match. The MATCH_NUMBER starts from 1, and is incremented for each match.
MATCH_SEQUENCE_NUMBER() Returns the row number within a match. The MATCH_SEQUENCE_NUMBER is sequential and starts from 1.
CLASSIFIER() Returns a TEXT value that contains the symbol that the respective row matched. For example, if a row matched the symbol GT75, then the CLASSIFIER function returns the string “GT75”.

Note

When specifying measures, remember the restrictions mentioned in the 在 DEFINE 和 MEASURES 中使用的窗口函数的相关限制 section.

ROW(S) PER MATCH：指定要返回的行¶

{
  ONE ROW PER MATCH  |
  ALL ROWS PER MATCH [ { SHOW EMPTY MATCHES | OMIT EMPTY MATCHES | WITH UNMATCHED ROWS } ]
}

指定要返回哪些行才能成功匹配。此分子句是可选的。

ALL ROWS PER MATCH: Return all rows in the match.
ONE ROW PER MATCH: Return one summary row for each match, regardless of how many rows are in the match. This is the default.

请注意以下特殊情况：

Empty Matches: An empty match happens if a pattern is able to match against zero rows. For instance, if the pattern is defined as A* and the first row at the beginning of a matching attempt is assigned to symbol B, then an empty match including only that row is generated, because the * quantifier in the A* pattern allows 0 occurrences of A to be treated as a match. The MEASURES expressions are evaluated differently for this row:
- CLASSIFIER 函数返回 NULL。
- 窗口函数返回 NULL。
- COUNT 函数返回 0。
Unmatched Rows: If a row was not matched against the pattern, it is called an unmatched row. MATCH_RECOGNIZE can be configured to return unmatched rows, too. For unmatched rows, expressions in the MEASURES subclause return NULL.

排除

The exclusion syntax ({- ... -}) in the pattern definition allows users to exclude certain rows from the output. If all matched symbols in the pattern were excluded, no row is generated for that match if ALL ROWS PER MATCH was specified. Note that the MATCH_NUMBER is incremented anyway. Excluded rows are not part of the result, but are included for the evaluation of MEASURES expressions.

使用排除语法时，可以按以下方式指定 ROWS PER MATCH 分子句：

ONE ROW PER MATCH（默认）

Returns exactly one row for each successful match. The default window function semantic for window functions in the MEASURES subclause is FINAL.

The output columns of the MATCH_RECOGNIZE operator are all expressions given in the PARTITION BY subclause and all MEASURES expressions. All resulting rows of a match are grouped by the expressions given in the PARTITION BY subclause and the MATCH_NUMBER using the ANY_VALUE aggregation function for all measures. Therefore, if measures evaluate to a different value for different rows of the same match, then the output is non-deterministic.

Omitting the PARTITION BY and MEASURES subclause results in an error indicating that the result does not include any columns.

对于空匹配项，将生成一行。不匹配的行不在输出中。

ALL ROWS PER MATCH

Returns a row for each row that is part of the match, except for rows that were matched to a portion of the pattern that was marked for exclusion.

Excluded rows are still taken into account in computations in the MEASURES subclause.

Matches might overlap based on the AFTER MATCH SKIP TO subclause, so the same row might appear multiple times in the output.

The default window function semantic for window functions in the MEASURES subclause is RUNNING.

The output columns of the MATCH_RECOGNIZE operator are the columns of the set of rows being input and the columns defined in the MEASURES subclause.

The following options are available for ALL ROWS PER MATCH:
- SHOW EMPTY MATCHES (default) Add empty matches to the output. Unmatched rows are not output.
- OMIT EMPTY MATCHES Neither empty matches nor unmatched rows are output. However, the MATCH_NUMBER is still incremented by an empty match.
- WITH UNMATCHED ROWS Adds empty matches and unmatched rows to the output. If this clause is used, then the pattern must not contain exclusions.

For an example that uses exclusion to reduce irrelevant output, see Search for Patterns in Non-Adjacent Rows.

AFTER MATCH SKIP：指定匹配后在何处继续¶

AFTER MATCH SKIP
{
    PAST LAST ROW   |
    TO NEXT ROW   |
    TO [ { FIRST | LAST} ] <symbol>
}

此分子句指定找到正匹配项后在何处继续进行匹配。

PAST LAST ROW (default)

在当前匹配项的最后一行之后继续进行匹配。

This prevents matches that contain overlapping rows. For example, if you have a stock pattern that contains 3 V shapes in a row, then PAST LAST ROW finds one W pattern, not two.

TO NEXT ROW

在当前匹配项的第一行之后继续进行匹配。

This allows matches that contain overlapping rows. For example, if you have a stock pattern that contains 3 V shapes in a row, then TO NEXT ROW finds two W patterns (the first pattern is based on the first two V shapes, and the second W shape is based on the second and third V shapes; thus both patterns contain the same V).

TO [ { FIRST | LAST } ] <symbol>

在已与给定符号匹配的第一行或最后一行（默认）处继续进行匹配。

至少需要将一行映射到给定符号，否则会引发错误。

如果这没有跳过当前匹配项的第一行，则会引发错误。

使用说明

DEFINE 和 MEASURES 子句中的表达式¶

The DEFINE and MEASURES clauses allow expressions. Those expressions can be complex and can include window functions and special navigational functions (which are a type of window function).

In most respects, expressions in DEFINE and MEASURES follow the rules for expressions elsewhere in Snowflake SQL syntax. However, there are some differences, which are described below:

Window Functions:

导航函数允许引用除当前行以外的其他行。例如，要创建定义价格下降的表达式，需要将一行中的价格与另一行中的价格进行比较。导航函数包括：

PREV( expr [ , offset [, default ] ] ) Navigate to the previous row within the current match in the MEASURES subclause.

This function is currently not available in the DEFINE subclause. Instead, you can use LAG which navigates to the previous row within the current window frame.
NEXT( expr [ , offset [ , default ] ] ) Navigate to the next row within the current window frame. This function is equivalent to LEAD.
FIRST( expr ) Navigate to the first row of the current match in the MEASURES subclause.

This function is currently not available in the DEFINE subclause. Instead, you can use FIRST_VALUE which navigates to the first row of the current window frame.
LAST( expr ) Navigate to the last row of the current window frame. This function is similar to LAST_VALUE, but for LAST the window frame is limited to the current row of the current matching attempt when LAST is used within the DEFINE subclause.

For an example that uses the navigational functions, see Returning Information About the Match.

In general, when a window function is used inside a MATCH_RECOGNIZE clause, the window function does not require its own OVER (PARTITION BY ... ORDER BY ...) clause. The window is implicitly determined by the PARTITION BY and ORDER BY of the MATCH_RECOGNIZE clause. (However, see 在 DEFINE 和 MEASURES 中使用的窗口函数的相关限制 for some exceptions.)

In general, the window frame is also derived implicitly from the current context in which the window function is being used. The lower bound of the frame is defined as described below:

In the DEFINE subclause:
The frame starts at the beginning of the current matching attempt except when using LAG, LEAD, FIRST_VALUE, and LAST_VALUE.
In the MEASURES subclause:
框架在找到的匹配项的开头开始。

The edges of the window frame can be specified by using either RUNNING or FINAL semantics.

expr ::= ... [ { RUNNING | FINAL } ] windowFunction ...
RUNNING:
通常，框架在当前行结束。但是，存在以下例外情况：

In the DEFINE subclause, for LAG, LEAD, FIRST_VALUE, LAST_VALUE, and NEXT, the frame ends at the last row of the window.

In the MEASURES subclause, for PREV, NEXT, LAG, and LEAD, the frame ends at the last row of the window.

In the DEFINE subclause, RUNNING is the default (and the only allowed) semantic.
In the MEASURES subclause, when the ALL ROWS PER MATCH subclause is used, RUNNING is the default.
FINAL:
框架在匹配项的最后一行结束。
FINAL is allowed only in the MEASURES subclause. It is the default there when ONE ROW PER MATCH applies.

Symbol Predicates:

Expressions within the DEFINE and MEASURES subclauses allow symbols as predicates for column references.

predicatedColumnReference ::= <symbol>.<column>

The <symbol> indicates a row that was matched, and the <column> identifies a specific column within that row.

带谓词的列引用意味着周围的窗口函数仅查看最终映射到指定符号的行。

Predicated column references can be used outside and inside of a window function. If used outside of a window function, <symbol>.<column> is the same as LAST(<symbol>.<column>). Inside of a window function, all column references either need to be predicated with the same symbol or are all non-predicated.

以下内容说明了与导航相关的函数在带谓词的列引用下的行为方式：

{PREV/LAG( ... <symbol>.<column> ... , <offset>)} Searches the window frame backwards starting from and including the current row (or last row in case of a FINAL semantic) for the first row that was finally mapped to the specified <symbol>, and then goes <offset> (default is 1) rows backwards, ignoring the symbol those rows were mapped to. If the searched part of the frame does not contain a row mapped to <symbol> or the search would go beyond the edge of the frame, then NULL is returned.
{NEXT/LEAD( ... <symbol>.<column> ... , <offset>)} Searches the window frame backwards starting from and including the current row (or last row in case of a FINAL semantic) for the first row that was finally mapped to the specified <symbol>, and then goes <offset> (default is 1) rows forward, ignoring the symbol those rows were mapped to. If the searched part of the frame does not contain a row mapped to <symbol> or the search would go beyond the edge of the frame, then NULL is returned.
{FIRST/FIRST_VALUE( ... <symbol>.<column> ... )} Searches the window frame forwards starting from and including the first row up to and including the current row (or last row in case of a FINAL semantic) for the first row that was finally mapped to the specified <symbol>. If the searched part of the frame does not contain a row mapped to <symbol>, NULL is returned.
{LAST/LAST_VALUE( ... <symbol>.<column> ... )} Searches the window frame backwards starting from and including the current row (or last row in case of a FINAL semantic) for the first row that was finally mapped to the specified <symbol>. If the searched part of the frame does not contain a row mapped to <symbol>, NULL is returned.

Note

Restrictions on window functions are documented in the 在 DEFINE 和 MEASURES 中使用的窗口函数的相关限制 section.

在 DEFINE 和 MEASURES 中使用的窗口函数的相关限制¶

Expressions in the DEFINE and MEASURES subclauses can include window functions. However, there are some limitations on using window functions in these subclauses. These limitations are shown in the table below:

Function DEFINE (Running) [column/symbol.column] MEASURES (Running) [column/symbol.column] MEASURES (Final) [column/symbol.column]
Column ✔ / ❌ ✔ / ❌ ✔ / ✔
PREV(…) ❌ / ❌ ✔ / ❌ ✔ / ❌
NEXT(…) ✔ / ❌ ✔ / ❌ ✔ / ❌
FIRST(…) ❌ / ❌ ✔ / ❌ ✔ / ✔
LAST(…) ✔ / ❌ ✔ / ❌ ✔ / ✔
LAG() ✔ / ❌ ✔ / ❌ ✔ / ❌
LEAD() ✔ / ❌ ✔ / ❌ ✔ / ❌
FIRST_VALUE() ✔ / ❌ ✔ / ❌ ✔ / ✔
LAST_VALUE() ✔ / ❌ ✔ / ❌ ✔ / ✔
Aggregations ^[1] ✔ / ❌ ✔ / ✔ ✔ / ✔
Other window functions ^[1] ✔ / ❌ ✔ / ❌ ✔ / ❌

Function	DEFINE (Running) [column/symbol.column]	MEASURES (Running) [column/symbol.column]	MEASURES (Final) [column/symbol.column]
Column	✔ / ❌	✔ / ❌	✔ / ✔
PREV(…)	❌ / ❌	✔ / ❌	✔ / ❌
NEXT(…)	✔ / ❌	✔ / ❌	✔ / ❌
FIRST(…)	❌ / ❌	✔ / ❌	✔ / ✔
LAST(…)	✔ / ❌	✔ / ❌	✔ / ✔
LAG()	✔ / ❌	✔ / ❌	✔ / ❌
LEAD()	✔ / ❌	✔ / ❌	✔ / ❌
FIRST_VALUE()	✔ / ❌	✔ / ❌	✔ / ✔
LAST_VALUE()	✔ / ❌	✔ / ❌	✔ / ✔
Aggregations ^[1]	✔ / ❌	✔ / ✔	✔ / ✔
Other window functions ^[1]	✔ / ❌	✔ / ❌	✔ / ❌

The MATCH_RECOGNIZE-specific functions MATCH_NUMBER(), MATCH_SEQUENCE_NUMBER(), and CLASSIFIER() are currently not available in the DEFINE subclause.

故障排除

使用 ONE ROW PER MATCH 时出现错误消息：“SELECT with no columns”¶

When you use the ONE ROW PER MATCH clause, only columns and expressions from the PARTITION BY and MEASURES subclauses are allowed in the projection clause of the SELECT. If you try to use MATCH_RECOGNIZE without either a PARTITION BY or MEASURES clause, you get an error similar to SELECT with no columns.

For more information about ONE ROW PER MATCH vs. ALL ROWS PER MATCH, see Generating One Row for Each Match vs Generating All Rows for Each Match.

示例

The topic Identifying Sequences of Rows That Match a Pattern contains many examples, including some that are simpler than most of the examples here. If you are not already familiar with MATCH_RECOGNIZE, then you might want to read those examples first.

以下一些示例使用下面的表和数据：

create table stock_price_history (company TEXT, price_date DATE, price INT);

insert into stock_price_history values
    ('ABCD', '2020-10-01', 50),
    ('XYZ' , '2020-10-01', 89),
    ('ABCD', '2020-10-02', 36),
    ('XYZ' , '2020-10-02', 24),
    ('ABCD', '2020-10-03', 39),
    ('XYZ' , '2020-10-03', 37),
    ('ABCD', '2020-10-04', 42),
    ('XYZ' , '2020-10-04', 63),
    ('ABCD', '2020-10-05', 30),
    ('XYZ' , '2020-10-05', 65),
    ('ABCD', '2020-10-06', 47),
    ('XYZ' , '2020-10-06', 56),
    ('ABCD', '2020-10-07', 71),
    ('XYZ' , '2020-10-07', 50),
    ('ABCD', '2020-10-08', 80),
    ('XYZ' , '2020-10-08', 54),
    ('ABCD', '2020-10-09', 75),
    ('XYZ' , '2020-10-09', 30),
    ('ABCD', '2020-10-10', 63),
    ('XYZ' , '2020-10-10', 32);

下图显示了曲线的形状：

Chart of Prices of Stocks "ABCD" and "XYZ"

Report one summary row for each `V` shape¶

The following query searches for all V shapes in the previously presented stock_price_history. The output is explained in more detail after the query and output.

SELECT * FROM stock_price_history
  MATCH_RECOGNIZE(
    PARTITION BY company
    ORDER BY price_date
    MEASURES
      MATCH_NUMBER() AS match_number,
      FIRST(price_date) AS start_date,
      LAST(price_date) AS end_date,
      COUNT(*) AS rows_in_sequence,
      COUNT(row_with_price_decrease.*) AS num_decreases,
      COUNT(row_with_price_increase.*) AS num_increases
    ONE ROW PER MATCH
    AFTER MATCH SKIP TO LAST row_with_price_increase
    PATTERN(row_before_decrease row_with_price_decrease+ row_with_price_increase+)
    DEFINE
      row_with_price_decrease AS price < LAG(price),
      row_with_price_increase AS price > LAG(price)
  )
ORDER BY company, match_number;
+---------+--------------+------------+------------+------------------+---------------+---------------+
| COMPANY | MATCH_NUMBER | START_DATE | END_DATE   | ROWS_IN_SEQUENCE | NUM_DECREASES | NUM_INCREASES |
|---------+--------------+------------+------------+------------------+---------------+---------------|
| ABCD    |            1 | 2020-10-01 | 2020-10-04 |                4 |             1 |             2 |
| ABCD    |            2 | 2020-10-04 | 2020-10-08 |                5 |             1 |             3 |
| XYZ     |            1 | 2020-10-01 | 2020-10-05 |                5 |             1 |             3 |
| XYZ     |            2 | 2020-10-05 | 2020-10-08 |                4 |             2 |             1 |
| XYZ     |            3 | 2020-10-08 | 2020-10-10 |                3 |             1 |             1 |
+---------+--------------+------------+------------+------------------+---------------+---------------+

输出为每个匹配项显示一行（无论匹配项中有多少行）。

输出包含以下列：

COMPANY：公司的股票代码。
The MATCH_NUMBER is a sequential number identifying which match this was within this data set (e.g. the first match has MATCH_NUMBER 1, the second match has MATCH_NUMBER 2, etc.). If the data was partitioned, then the MATCH_NUMBER is the sequential number within the partition (in this example, for each company/stock).
START_DATE：模式的此次出现的开始日期。
END_DATE：模式的此次出现的结束日期。
ROWS_IN_SEQUENCE: This is the number of rows in the match. For example, the first match is based on the prices measured on 4 days (October 1 through October 4), so ROWS_IN_SEQUENCE is 4.
NUM_DECREASES: This is the number of days (within the match) that the price went down. For example, in the first match, the price went down for 1 day and then went up for 2 days, so NUM_DECREASES is 1.
NUM_INCREASES: This is the number of days (within the match) that the price went up. For example, in the first match, the price went down for 1 day and then went up for 2 days, so NUM_INCREASES is 2.

报告一家公司的所有匹配项的所有行

此示例返回每个匹配项中的所有行（而不仅仅是每个匹配项的一个汇总行）。此模式搜索“ABCD”公司的价格上涨：

select price_date, match_number, msq, price, cl from
  (select * from stock_price_history where company='ABCD') match_recognize(
    order by price_date
    measures
        match_number() as "MATCH_NUMBER",
        match_sequence_number() as msq,
        classifier() as cl
    all rows per match
    pattern(ANY_ROW UP+)
    define
        ANY_ROW AS TRUE,
        UP as price > lag(price)
)
order by match_number, msq;
+------------+--------------+-----+-------+---------+
| PRICE_DATE | MATCH_NUMBER | MSQ | PRICE | CL      |
|------------+--------------+-----+-------+---------|
| 2020-10-02 |            1 |   1 |    36 | ANY_ROW |
| 2020-10-03 |            1 |   2 |    39 | UP      |
| 2020-10-04 |            1 |   3 |    42 | UP      |
| 2020-10-05 |            2 |   1 |    30 | ANY_ROW |
| 2020-10-06 |            2 |   2 |    47 | UP      |
| 2020-10-07 |            2 |   3 |    71 | UP      |
| 2020-10-08 |            2 |   4 |    80 | UP      |
+------------+--------------+-----+-------+---------+

省略空匹配项

此示例搜索高于公司整个图表的平均价格的价格范围。此示例省略了空匹配项。但请注意，空匹配项仍然会让 MATCH_NUMBER 递增：

select * from stock_price_history match_recognize(
    partition by company
    order by price_date
    measures
        match_number() as "MATCH_NUMBER"
    all rows per match omit empty matches
    pattern(OVERAVG*)
    define
        OVERAVG as price > avg(price) over (rows between unbounded
                                  preceding and unbounded following)
)
order by company, price_date;
+---------+------------+-------+--------------+
| COMPANY | PRICE_DATE | PRICE | MATCH_NUMBER |
|---------+------------+-------+--------------|
| ABCD    | 2020-10-07 |    71 |            7 |
| ABCD    | 2020-10-08 |    80 |            7 |
| ABCD    | 2020-10-09 |    75 |            7 |
| ABCD    | 2020-10-10 |    63 |            7 |
| XYZ     | 2020-10-01 |    89 |            1 |
| XYZ     | 2020-10-04 |    63 |            4 |
| XYZ     | 2020-10-05 |    65 |            4 |
| XYZ     | 2020-10-06 |    56 |            4 |
| XYZ     | 2020-10-08 |    54 |            6 |
+---------+------------+-------+--------------+

演示 WITH UNMATCHED ROWS 选项¶

This example demonstrates the WITH UNMATCHED ROWS option. Like the 省略空匹配项 example above, this example searches for price ranges above the average price of each company’s chart. Note that the quantifier in this query is +, while the quantifier in the previous query was *:

select * from stock_price_history match_recognize(
    partition by company
    order by price_date
    measures
        match_number() as "MATCH_NUMBER",
        classifier() as cl
    all rows per match with unmatched rows
    pattern(OVERAVG+)
    define
        OVERAVG as price > avg(price) over (rows between unbounded
                                 preceding and unbounded following)
)
order by company, price_date;
+---------+------------+-------+--------------+---------+
| COMPANY | PRICE_DATE | PRICE | MATCH_NUMBER | CL      |
|---------+------------+-------+--------------+---------|
| ABCD    | 2020-10-01 |    50 |         NULL | NULL    |
| ABCD    | 2020-10-02 |    36 |         NULL | NULL    |
| ABCD    | 2020-10-03 |    39 |         NULL | NULL    |
| ABCD    | 2020-10-04 |    42 |         NULL | NULL    |
| ABCD    | 2020-10-05 |    30 |         NULL | NULL    |
| ABCD    | 2020-10-06 |    47 |         NULL | NULL    |
| ABCD    | 2020-10-07 |    71 |            1 | OVERAVG |
| ABCD    | 2020-10-08 |    80 |            1 | OVERAVG |
| ABCD    | 2020-10-09 |    75 |            1 | OVERAVG |
| ABCD    | 2020-10-10 |    63 |            1 | OVERAVG |
| XYZ     | 2020-10-01 |    89 |            1 | OVERAVG |
| XYZ     | 2020-10-02 |    24 |         NULL | NULL    |
| XYZ     | 2020-10-03 |    37 |         NULL | NULL    |
| XYZ     | 2020-10-04 |    63 |            2 | OVERAVG |
| XYZ     | 2020-10-05 |    65 |            2 | OVERAVG |
| XYZ     | 2020-10-06 |    56 |            2 | OVERAVG |
| XYZ     | 2020-10-07 |    50 |         NULL | NULL    |
| XYZ     | 2020-10-08 |    54 |            3 | OVERAVG |
| XYZ     | 2020-10-09 |    30 |         NULL | NULL    |
| XYZ     | 2020-10-10 |    32 |         NULL | NULL    |
+---------+------------+-------+--------------+---------+

在 MEASURES 子句中演示符号谓词¶

This example shows the use of <symbol>.<column> notation with symbol predicates:

SELECT company, price_date, price, "FINAL FIRST(LT45.price)", "FINAL LAST(LT45.price)"
    FROM stock_price_history
       MATCH_RECOGNIZE (
           PARTITION BY company
           ORDER BY price_date
           MEASURES
               FINAL FIRST(LT45.price) AS "FINAL FIRST(LT45.price)",
               FINAL LAST(LT45.price)  AS "FINAL LAST(LT45.price)"
           ALL ROWS PER MATCH
           AFTER MATCH SKIP PAST LAST ROW
           PATTERN (LT45 LT45)
           DEFINE
               LT45 AS price < 45.00
           )
    WHERE company = 'ABCD'
    ORDER BY price_date;
+---------+------------+-------+-------------------------+------------------------+
| COMPANY | PRICE_DATE | PRICE | FINAL FIRST(LT45.price) | FINAL LAST(LT45.price) |
|---------+------------+-------+-------------------------+------------------------|
| ABCD    | 2020-10-02 |    36 |                      36 |                     39 |
| ABCD    | 2020-10-03 |    39 |                      36 |                     39 |
| ABCD    | 2020-10-04 |    42 |                      42 |                     30 |
| ABCD    | 2020-10-05 |    30 |                      42 |                     30 |
+---------+------------+-------+-------------------------+------------------------+

MATCH_RECOGNIZE¶

语法

必需的分子句

DEFINE：定义符号¶

PATTERN：指定要匹配的模式¶

可选分子句

ORDER BY：匹配前对行进行排序¶

PARTITION BY：将行分区到窗口¶

MEASURES：指定其他输出列¶

ROW(S) PER MATCH：指定要返回的行¶

AFTER MATCH SKIP：指定匹配后在何处继续¶

使用说明

DEFINE 和 MEASURES 子句中的表达式¶

在 DEFINE 和 MEASURES 中使用的窗口函数的相关限制¶

故障排除

使用 ONE ROW PER MATCH 时出现错误消息：“SELECT with no columns”¶

示例

Report one summary row for each V shape¶

报告一家公司的所有匹配项的所有行

省略空匹配项

演示 WITH UNMATCHED ROWS 选项¶

在 MEASURES 子句中演示符号谓词¶

Report one summary row for each `V` shape¶