GET_QUERY_OPERATOR_STATS¶
返回有关已完成查询中各个查询运算符的统计信息。您可以对过去 14 天内执行的任何已完成的查询运行此函数。
You can use this information to understand the structure of a query and identify query operators --- for example, the join operator --- that cause performance problems.
For example, you can use this information to determine which operators are consuming the most resources. As another example, you can use this function to identify joins that have more output rows than input rows, which can be a sign of an "exploding" join; for example, an unintended Cartesian product.
Snowsight 中的 查询配置文件 选项卡中也提供了这些统计信息。GET_QUERY_OPERATOR_STATS() 函数通过编程接口提供相同的信息。
有关查找有问题的查询运算符的更多信息,请参阅 查询配置文件识别的常见查询问题。
语法¶
GET_QUERY_OPERATOR_STATS( <query_id> )
实参¶
query_id查询的 ID。您可以使用:
字符串字面量(由单引号括起来的字符串)。
包含查询 ID 的 会话变量。
LAST_QUERY_ID 函数调用的返回值。
返回¶
GET_QUERY_OPERATOR_STATS 函数是一个 表函数。该函数返回包含有关查询中每个查询运算符的统计信息的行。有关更多信息,请参阅以下章节:使用说明 和 输出。
使用说明¶
This function returns statistics only for queries that have completed.
您必须具有运行查询的仓库的 OPERATE 或 MONITOR 权限。
This function provides detailed statistics about each query operator used in the specified query. The following list shows the possible query operators:
Aggregate:对输入进行分组并计算聚合函数。
CartesianJoin:一种特殊类型的联接。
Delete:从表中移除记录。
ExternalFunction:表示外部函数执行的处理。
ExternalScan:表示对存储在暂存区对象中的数据的访问。
Filter:表示筛选行的操作。
Flatten:处理 VARIANT 记录,可能在指定路径上展平记录。
Generator: Generates records by using the TABLE(GENERATOR(...)) construct.
GroupingSets: Represents constructs, such as GROUPING SETS, ROLLUP, and CUBE.
Insert:通过 INSERT 或 COPY 操作将记录添加到表中。
InternalObject: Represents access to an internal data object; for example, in an Information Schema or the result of a previous query.
Join:在给定条件下组合两个输入。
JoinFilter:特殊筛选操作,用于移除识别为可能不符合查询计划中进一步联接条件的元组。
Merge:对表执行 MERGE 操作。
Pivot:将唯一值从一列转换为多列,并进行任何必要的聚合。
Result:返回查询结果。
Sort:对给定表达式的输入进行排序。
SortWithLimit:排序后生成输入序列的一部分,通常是
ORDER BY ... LIMIT ... OFFSET ...构造的结果。TableScan:表示对单个表的访问。
UnionAll:连接两个输入。
Unload:表示将数据从表导出到暂存区中的文件的 COPY 操作。
Unpivot:通过将列转换为行来旋转表。
Update:更新表中的记录。
ValuesClause:随 VALUES 子句提供的值列表。
WindowFunction:计算窗口函数。
WithClause:位于 SELECT 语句正文之前,并定义一个或多个 CTEs。
WithReference:WITH 子句的实例。
信息以表的形式返回。表中的每一行对应一个运算符。行包含该运算符的执行细分和查询统计信息。
该行还可以列出 操作符属性 (这些属性取决于操作符的类型)。
Statistics that break down query execution time are expressed as a percentage of the total query execution time.
有关特定统计信息的更多信息,请参阅 输出 (本主题内容)。
因为此函数是表函数,所以必须在 FROM 子句中使用它,并且必须将它封装在
TABLE()中。例如:SELECT * FROM TABLE(GET_QUERY_OPERATOR_STATS(last_query_id()));
输出¶
该函数返回以下列:
Column name |
Data type |
描述 |
|---|---|---|
QUERY_ID |
VARCHAR |
查询 ID,它是 SQL 语句的内部系统生成的标识符。 |
STEP_ID |
NUMBER(38, 0) |
查询计划中步骤的标识符。 |
OPERATOR_ID |
NUMBER(38, 0) |
运算符的标识符。这在查询中是唯一的。值从 0 开始。 |
PARENT_OPERATORS |
包含一个或多个 NUMBER(38, 0) 的 ARRAY |
此运算符的父运算符的标识符;如果这是查询计划中的最后一个运算符(通常是 Result 运算符),则值为 NULL。 |
OPERATOR_TYPE |
VARCHAR |
The type of query operator; for example, |
包含 OBJECT 的 VARIANT |
Statistics about the operator (for example, the number of output rows from the operator). |
|
包含 OBJECT 的 VARIANT |
有关运算符执行时间的信息。 |
|
包含 OBJECT 的 VARIANT |
Information about the operator. This information depends on the operator type. |
如果运算符的特定列没有信息,则值为 NULL。
Three of these columns contain OBJECTs. Each object contains key/value pairs. The tables below describe the keys in these objects.
OPERATOR_STATISTICS¶
OPERATOR_STATISTICS 列的 OBJECTs 中的字段提供有关运算符的其他信息。这些信息可以包括:
键 |
Nested key (if applicable) |
Data type |
描述 |
|---|---|---|---|
|
Statistics for Data Manipulation Language (DML) queries. |
||
|
DOUBLE |
Number of rows inserted into a table or tables. |
|
|
DOUBLE |
Number of rows updated in a table. |
|
|
DOUBLE |
Number of rows deleted from a table. |
|
|
DOUBLE |
Number of rows unloaded during data export. |
|
|
Information about calls to extension functions. If the value of a field is zero, then the field is not displayed. |
||
|
DOUBLE |
Amount of time for the Java UDF handler to load. |
|
|
DOUBLE |
Number of times the Java UDF handler is invoked. |
|
|
DOUBLE |
Maximum amount of time for the Java UDF handler to execute. |
|
|
DOUBLE |
Average amount of time to execute the Java UDF handler. |
|
|
DOUBLE |
Number of times the Java UDTF process method was invoked. |
|
|
DOUBLE |
Amount of time to execute the Java UDTF process. |
|
|
DOUBLE |
Average amount of time to execute the Java UDTF process. |
|
|
DOUBLE |
Number of times the Java UDTF constructor was invoked. |
|
|
DOUBLE |
Amount of time to execute the Java UDTF constructor. |
|
|
DOUBLE |
Average amount of time to execute the Java UDTF constructor. |
|
|
DOUBLE |
Number of times the Java UDTF endPartition method was invoked. |
|
|
DOUBLE |
Amount of time to execute the Java UDTF endPartition method. |
|
|
DOUBLE |
Average amount of time to execute the Java UDTF |
|
|
DOUBLE |
Maximum amount of time to download the Java UDF dependencies. |
|
|
DOUBLE |
Peak memory usage as reported by the JVM. |
|
|
DOUBLE |
Compile time for the Java UDF inline code. |
|
|
DOUBLE |
Number of times the Python UDF handler was invoked. |
|
|
DOUBLE |
Total execution time for the Python UDF handler. |
|
|
DOUBLE |
Average amount of time to execute the Python UDF handler. |
|
|
DOUBLE |
Peak memory usage by the Python sandbox environment. |
|
|
DOUBLE |
Average amount of time to create the Python environment, including downloading and installing packages. |
|
|
DOUBLE |
Amount of time to run the Conda solver to solve Python packages. |
|
|
DOUBLE |
Amount of time to create the Python environment. |
|
|
DOUBLE |
Amount of time to initialize the Python UDF. |
|
|
DOUBLE |
Number of external file bytes read for UDFs. |
|
|
DOUBLE |
Number of external files accessed for UDFs. |
|
|
Information about calls to external functions. If the value of a field --- for example
|
||
|
DOUBLE |
Number of times that an external function was called. This number can be different from the number of external function calls in the text of the SQL statement because of the number of batches that rows are divided into, the number of retries if there are transient network problems, and so on. |
|
|
DOUBLE |
Number of rows sent to external functions. |
|
|
DOUBLE |
Number of rows received back from external functions. |
|
|
DOUBLE |
Number of bytes sent to external functions. If the key includes |
|
|
DOUBLE |
Number of bytes received from external functions. If the key includes |
|
|
DOUBLE |
Number of retries because of transient errors. |
|
|
DOUBLE |
Average amount of time per invocation (call) in milliseconds between the time Snowflake sent the data and received the returned data. |
|
|
INTEGER |
返回 4xx 状态代码的 HTTP 请求的总数。 |
|
|
INTEGER |
返回 5xx 状态代码的 HTTP 请求的总数。 |
|
|
DOUBLE |
成功 HTTP 请求的平均延迟。 |
|
|
DOUBLE |
Average overhead per successful request because of a slowdown caused by throttling (HTTP 429). |
|
|
DOUBLE |
Number of batches that were retried because of HTTP 429 errors. |
|
|
DOUBLE |
成功 HTTP 请求的第 50 个百分位延迟。在所有成功请求中,有 50% 的请求完成时间少于此时间。 |
|
|
DOUBLE |
成功 HTTP 请求的第 90 个百分位延迟。在所有成功请求中,有 90% 的请求完成时间少于此时间。 |
|
|
DOUBLE |
成功 HTTP 请求的第 95 个百分位延迟。在所有成功请求中,有 95% 的请求完成时间少于此时间。 |
|
|
DOUBLE |
成功 HTTP 请求的第 99 个百分位延迟。在所有成功请求中,有 99% 的请求完成时间少于此时间。 |
|
|
INTEGER |
Number of input rows. This can be missing for an operator with no input edges from other operators. |
|
|
有关查询期间执行的 I/O(输入/输出)操作的信息: |
||
|
DOUBLE |
Percentage of data scanned for a given table so far. |
|
|
DOUBLE |
Number of bytes scanned so far. |
|
|
DOUBLE |
Percentage of data scanned from the local disk cache. |
|
|
DOUBLE |
Bytes written; for example, when loading into a table. |
|
|
DOUBLE |
写入结果对象的字节数。 For example, 通常,结果对象表示作为查询结果生成的任何内容, |
|
|
DOUBLE |
从结果对象中读取的字节数。 |
|
|
DOUBLE |
Bytes read from an external object; for example, a stage. |
|
|
|
DOUBLE |
Amount of data sent over the network. |
|
INTEGER |
Number of output rows. This can be missing for the operator that returns the results to the user; which is usually the RESULT operator. |
|
|
有关表修剪的信息。 |
||
|
DOUBLE |
Number of partitions pruned by Snowflake Optima. |
|
|
DOUBLE |
Number of partitions scanned so far. |
|
|
DOUBLE |
Total number of partitions in a given table. |
|
|
有关中间结果不适合内存的操作的磁盘使用情况的信息。 |
||
|
DOUBLE |
Volume of data spilled to remote disk. |
|
|
DOUBLE |
Volume of data spilled to local disk. |
|
|
有关使用 搜索优化服务 的查询信息。 |
||
|
DOUBLE |
Number of partitions pruned by search optimization. |
|
|
DOUBLE |
通过搜索优化和 Snowflake Optima 修剪的分区数。 |
EXECUTION_TIME_BREAKDOWN¶
EXECUTION_TIME_BREAKDOWN 列的 OBJECTs 中的字段如下所示。
键 |
Data type |
描述 |
|---|---|---|
|
DOUBLE |
Percentage of the total query time spent by this operator. |
|
DOUBLE |
设置查询处理所用的时间。 |
|
DOUBLE |
CPU 处理数据所用的时间。 |
|
DOUBLE |
同步参与进程之间的活动所用的时间。 |
|
DOUBLE |
等待本地磁盘访问时阻止处理的时间。 |
|
DOUBLE |
等待远程磁盘访问时阻止处理的时间。 |
|
DOUBLE |
处理等待网络数据传输的时间。 |
OPERATOR_ATTRIBUTES¶
Each output row describes one operator in the query. The following table shows the possible types of operators; for example, the Filter operator. For each type of operator, the table shows the possible attributes; for example, the expression used to filter the rows.
运算符属性存储在 OPERATOR_ATTRIBUTES 列中,该列的类型为 VARIANT 并且包含 OBJECT。OBJECT 包含键值对。每个键对应运算符的一个属性。
Operator name |
键 |
Data type |
描述 |
|---|---|---|---|
|
|||
|
VARCHAR 的 ARRAY |
计算的函数列表。 |
|
|
VARCHAR 的 ARRAY |
Group-by expression. |
|
|
|||
|
VARCHAR |
非相等联接表达式。 |
|
|
VARCHAR |
相等联接表达式。 |
|
|
VARCHAR |
联接 (INNER) 的类型。 |
|
|
|
VARCHAR |
更新表的名称。 |
|
|||
|
VARCHAR |
从中读取数据的暂存区的名称。 |
|
|
VARCHAR |
暂存区的类型。 |
|
|
|
VARCHAR |
用于筛选数据的表达式。 |
|
|
VARCHAR |
Input expression used to flatten data. |
|
|||
|
NUMBER |
输入参数 ROWCOUNT 的值。 |
|
|
NUMBER |
输入参数 TIMELIMIT 的值。 |
|
|
|||
|
VARCHAR 的 ARRAY |
计算的函数列表。 |
|
|
VARCHAR 的 ARRAY |
分组集的列表。 |
|
|
|||
|
VARCHAR |
插入了哪些表达式。 |
|
|
VARCHAR 的 ARRAY |
添加记录的表名列表。 |
|
|
|
VARCHAR |
Name of the accessed object. |
|
|||
|
VARCHAR |
非相等联接表达式。 |
|
|
VARCHAR |
相等联接表达式。 |
|
|
VARCHAR |
联接的类型(INNER、OUTER、LEFT JOIN 等)。 |
|
|
|
NUMBER |
Operator id of the join used to identify tuples that can be filtered out. |
|
|
VARCHAR |
更新表的名称。 |
|
|||
|
VARCHAR 的 ARRAY |
聚合结果的其余列。 |
|
|
VARCHAR 的 ARRAY |
生成的透视值列。 |
|
|
|
VARCHAR 的 ARRAY |
生成的表达式列表。 |
|
|
VARCHAR 的 ARRAY |
定义排序顺序的表达式。 |
|
|||
|
NUMBER |
生成的元组在已排列序列中的发出位置。 |
|
|
NUMBER |
生成的行数。 |
|
|
VARCHAR 的 ARRAY |
定义排序顺序的表达式。 |
|
|
|||
|
VARCHAR 的 ARRAY |
已扫描列的列表 |
|
|
VARCHAR 的 ARRAY |
从变体列提取的路径列表。 |
|
|
VARCHAR |
正在访问的表的别名。 |
|
|
VARCHAR |
正在访问的表的名称。 |
|
|
|
VARCHAR |
保存数据的暂存区。 |
|
|
VARCHAR 的 ARRAY |
取消透视查询的输出列。 |
|
|
VARCHAR |
更新表的名称。 |
|
|||
|
NUMBER |
生成的值的数量。 |
|
|
VARCHAR |
值的列表。 |
|
|
|
VARCHAR 的 ARRAY |
计算的函数列表。 |
|
|
VARCHAR |
WITH 子句的别名。 |
如果未列出运算符,则不生成任何属性,并将值报告为 {}。
备注
The following operators do not have any operator attributes and therefore are not included in the table of
OPERATOR_ATTRIBUTES:UnionAllExternalFunction
示例¶
以下示例调用 GET_QUERY_OPERATOR_STATS 函数。
检索关于单个查询的数据¶
此示例显示了联接两个小表的 SELECT 的统计信息。
运行 SELECT 语句:
SELECT x1.i, x2.i
FROM x1 INNER JOIN x2 ON x2.i = x1.i
ORDER BY x1.i, x2.i;
获取查询 ID:
SET lqid = (SELECT LAST_QUERY_ID());
调用 GET_QUERY_OPERATOR_STATS() 获取有关查询中各个查询运算符的统计信息:
SELECT * FROM TABLE(GET_QUERY_OPERATOR_STATS($lqid));
+--------------------------------------+---------+-------------+--------------------+---------------+-----------------------------------------+-----------------------------------------------+----------------------------------------------------------------------+
| QUERY_ID | STEP_ID | OPERATOR_ID | PARENT_OPERATORS | OPERATOR_TYPE | OPERATOR_STATISTICS | EXECUTION_TIME_BREAKDOWN | OPERATOR_ATTRIBUTES |
|--------------------------------------+---------+-------------+--------------------+---------------+-----------------------------------------+-----------------------------------------------+----------------------------------------------------------------------|
| 01a8f330-0507-3f5b-0000-43830248e09a | 1 | 0 | NULL | Result | { | { | { |
| | | | | | "input_rows": 64 | "overall_percentage": 0.000000000000000e+00 | "expressions": [ |
| | | | | | } | } | "X1.I", |
| | | | | | | | "X2.I" |
| | | | | | | | ] |
| | | | | | | | } |
| 01a8f330-0507-3f5b-0000-43830248e09a | 1 | 1 | [ 0 ] | Sort | { | { | { |
| | | | | | "input_rows": 64, | "overall_percentage": 0.000000000000000e+00 | "sort_keys": [ |
| | | | | | "output_rows": 64 | } | "X1.I ASC NULLS LAST", |
| | | | | | } | | "X2.I ASC NULLS LAST" |
| | | | | | | | ] |
| | | | | | | | } |
| 01a8f330-0507-3f5b-0000-43830248e09a | 1 | 2 | [ 1 ] | Join | { | { | { |
| | | | | | "input_rows": 128, | "overall_percentage": 0.000000000000000e+00 | "equality_join_condition": "(X2.I = X1.I)", |
| | | | | | "output_rows": 64 | } | "join_type": "INNER" |
| | | | | | } | | } |
| 01a8f330-0507-3f5b-0000-43830248e09a | 1 | 3 | [ 2 ] | TableScan | { | { | { |
| | | | | | "io": { | "overall_percentage": 0.000000000000000e+00 | "columns": [ |
| | | | | | "bytes_scanned": 1024, | } | "I" |
| | | | | | "percentage_scanned_from_cache": 1, | | ], |
| | | | | | "scan_progress": 1 | | "table_name": "MY_DB.MY_SCHEMA.X2" |
| | | | | | }, | | } |
| | | | | | "output_rows": 64, | | |
| | | | | | "pruning": { | | |
| | | | | | "partitions_scanned": 1, | | |
| | | | | | "partitions_total": 1 | | |
| | | | | | } | | |
| | | | | | } | | |
| 01a8f330-0507-3f5b-0000-43830248e09a | 1 | 4 | [ 2 ] | JoinFilter | { | { | { |
| | | | | | "input_rows": 64, | "overall_percentage": 0.000000000000000e+00 | "join_id": "2" |
| | | | | | "output_rows": 64 | } | } |
| | | | | | } | | |
| 01a8f330-0507-3f5b-0000-43830248e09a | 1 | 5 | [ 4 ] | TableScan | { | { | { |
| | | | | | "io": { | "overall_percentage": 0.000000000000000e+00 | "columns": [ |
| | | | | | "bytes_scanned": 1024, | } | "I" |
| | | | | | "percentage_scanned_from_cache": 1, | | ], |
| | | | | | "scan_progress": 1 | | "table_name": "MY_DB.MY_SCHEMA.X1" |
| | | | | | }, | | } |
| | | | | | "output_rows": 64, | | |
| | | | | | "pruning": { | | |
| | | | | | "partitions_scanned": 1, | | |
| | | | | | "partitions_total": 1 | | |
| | | | | | } | | |
| | | | | | } | | |
+--------------------------------------+---------+-------------+--------------------+---------------+-----------------------------------------+-----------------------------------------------+----------------------------------------------------------------------+
识别“分解”联接运算符¶
以下示例演示了如何使用 GET_QUERY_OPERATOR_STATS 检查复杂查询。此示例在查询中查找生成的行数比输入到该运算符的行数多的运算符。
这是要分析的查询:
SELECT *
FROM t1
JOIN t2 ON t1.a = t2.a
JOIN t3 ON t1.b = t3.b
JOIN t4 ON t1.c = t4.c;
获取上一个查询的查询 ID :
SET lid = LAST_QUERY_ID();
The following query shows the ratio of output rows to input rows for each of the join operators in the query:
SELECT operator_id,
operator_attributes,
operator_statistics:output_rows / operator_statistics:input_rows AS row_multiple
FROM TABLE(GET_QUERY_OPERATOR_STATS($lid))
WHERE operator_type = 'Join'
ORDER BY step_id, operator_id;
+---------+-------------+--------------------------------------------------------------------------+---------------+
| STEP_ID | OPERATOR_ID | OPERATOR_ATTRIBUTES | ROW_MULTIPLE |
+---------+-------------+--------------------------------------------------------------------------+---------------+
| 1 | 1 | { "equality_join_condition": "(T4.C = T1.C)", "join_type": "INNER" } | 49.969249692 |
| 1 | 3 | { "equality_join_condition": "(T3.B = T1.B)", "join_type": "INNER" } | 116.071428571 |
| 1 | 5 | { "equality_join_condition": "(T2.A = T1.A)", "join_type": "INNER" } | 12.20657277 |
+---------+-------------+--------------------------------------------------------------------------+---------------+
识别分解联接后,可以查看每个联接条件以验证条件是否正确。