Schema:

ACCOUNT_USAGE

AGGREGATE_QUERY_HISTORY 视图

此 Account Usage 视图使您能够监控和跟踪一段时间内的语句执行情况。它包含与 QUERY_HISTORY 视图类似的数据,但以一分钟为间隔,对重复的 SQL 语句进行汇总。您可以使用此视图来监控工作负载并分析性能。

除了对混合表进行的查询外,您在 Snowflake 中执行的所有查询都包含在 AGGREGATE_QUERY_HISTORY 中。然而,AGGREGATE_QUERY_HISTORY 对于监控和分析在高吞吐量下重复执行少量不同语句的 Unistore 工作负载而言特别有用。

Column NameData TypeDescription
CALLSNUMBERNumber of times the statement (query + query plan) was executed in the aggregation interval.
INTERVAL_START_TIMETIMESTAMP_LTZStart time of the window of measurement (in the local time zone).
INTERVAL_END_TIMETIMESTAMP_LTZEnd time of the window of measurement (in the local time zone).
QUERY_PARAMETERIZED_HASHTEXTUnique ID to identify identical parameterized queries. See QUERY_PARAMETERIZED_HASH 列.
QUERY_TEXTTEXTSample text of the SQL statement.
DATABASE_IDNUMBERInternal/system-generated identifier for the database that was in use.
DATABASE_NAMETEXTDatabase that was in use at the time of the query.
SCHEMA_IDNUMBERInternal/system-generated identifier for the schema that was in use.
SCHEMA_NAMETEXTSchema that was in use at the time of the query.
QUERY_TYPETEXTDML, query, etc. If the query failed, then the query type may be UNKNOWN.
SESSION_IDNUMBERSession that executed the statement.
USER_NAMETEXTUser who issued the query.
ROLE_NAMETEXTRole that was active in the session at the time of the query.
ROLE_TYPETEXTSpecifies APPLICATION, DATABASE_ROLE, or ROLE that executed the query.
WAREHOUSE_IDNUMBERInternal/system-generated identifier for the warehouse that was used.
WAREHOUSE_NAMETEXTWarehouse that the query executed on, if any.
WAREHOUSE_SIZETEXTSize of the warehouse when this statement executed.
WAREHOUSE_TYPETEXTType of the warehouse when this statement executed.
QUERY_TAGTEXTQuery tag set for this statement through the QUERY_TAG session parameter.
IS_CLIENT_GENERATED_STATEMENTBOOLEANIndicates whether the query was client-generated.
RELEASE_VERSIONTEXTRelease version in the format of major_release.minor_release.patch_release.
ERRORSARRAYList of error codes and messages that occurred during the aggregation interval. Each error is in the format of {"code": "code1", "message": "msg1", "count": 10}.
TOTAL_ELAPSED_TIMEOBJECTElapsed time (in milliseconds).
BYTES_SCANNEDOBJECTNumber of bytes scanned by this statement.
PERCENTAGE_SCANNED_FROM_CACHEOBJECTThe percentage of data scanned from the local disk cache. The value ranges from 0.0 to 1.0. Multiply by 100 to get a true percentage.
BYTES_WRITTENOBJECTNumber of bytes written (e.g. when loading into a table).
BYTES_WRITTEN_TO_RESULTOBJECTNumber of bytes written to a result object. For example, select * from . . . would produce a set of results in tabular format representing each field in the selection.

In general, the results object represents whatever is produced as a result of the query, and BYTES_WRITTEN_TO_RESULT represents the size of the returned result.
BYTES_READ_FROM_RESULTOBJECTNumber of bytes read from a result object.
ROWS_PRODUCEDOBJECTNumber of rows produced by this statement.
ROWS_INSERTEDOBJECTNumber of rows inserted by the query.
ROWS_UPDATEDOBJECTNumber of rows updated by the query.
ROWS_DELETEDOBJECTNumber of rows deleted by the query.
ROWS_UNLOADEDOBJECTNumber of rows unloaded during data export.
BYTES_DELETEDOBJECTNumber of bytes deleted by the query.
PARTITIONS_SCANNEDOBJECTNumber of micro-partitions scanned.
PARTITIONS_TOTALOBJECTTotal micro-partitions of all tables included in this query.
BYTES_SPILLED_TO_LOCAL_STORAGEOBJECTVolume of data spilled to local disk.
BYTES_SPILLED_TO_REMOTE_STORAGEOBJECTVolume of data spilled to remote disk.
BYTES_SENT_OVER_THE_NETWORKOBJECTVolume of data sent over the network.
COMPILATION_TIMEOBJECTCompilation time (in milliseconds).
EXECUTION_TIMEOBJECTExecution time (in milliseconds).
QUEUED_PROVISIONING_TIMEOBJECTTime (in milliseconds) spent in the warehouse queue, waiting for the warehouse compute resources to provision, due to warehouse creation, resume, or resize.
QUEUED_REPAIR_TIMEOBJECTTime (in milliseconds) spent in the warehouse queue, waiting for compute resources in the warehouse to be repaired.
QUEUED_OVERLOAD_TIMEOBJECTTime (in milliseconds) spent in the warehouse queue, due to the warehouse being overloaded by the current query workload.
TRANSACTION_BLOCKED_TIMEOBJECTTime (in milliseconds) spent blocked by a concurrent DML.
OUTBOUND_DATA_TRANSFER_CLOUDTEXTTarget cloud provider for statements that unload data to another region and/or cloud.
OUTBOUND_DATA_TRANSFER_REGIONTEXTTarget region for statements that unload data to another region and/or cloud.
OUTBOUND_DATA_TRANSFER_BYTESOBJECTNumber of bytes transferred in statements that unload data to another region and/or cloud.
INBOUND_DATA_TRANSFER_CLOUDTEXTSource cloud provider for statements that load data from another region and/or cloud.
INBOUND_DATA_TRANSFER_REGIONTEXTSource region for statements that load data from another region and/or cloud.
INBOUND_DATA_TRANSFER_BYTESOBJECTNumber of bytes transferred in a replication operation from another account. The source account could be in the same region or a different region than the current account.
LIST_EXTERNAL_FILES_TIMEOBJECTTime (in milliseconds) spent listing external files.
CREDITS_USED_CLOUD_SERVICESOBJECTNumber of credits used for cloud services.
EXTERNAL_FUNCTION_TOTAL_INVOCATIONSOBJECTAggregate number of times that this query called remote services. For important details, see the Usage Notes.
EXTERNAL_FUNCTION_TOTAL_SENT_ROWSOBJECTTotal number of rows that this query sent in all calls to all remote services.
EXTERNAL_FUNCTION_TOTAL_RECEIVED_ROWSOBJECTTotal number of rows that this query received from all calls to all remote services.
EXTERNAL_FUNCTION_TOTAL_SENT_BYTESOBJECTTotal number of bytes that this query sent in all calls to all remote services.
EXTERNAL_FUNCTION_TOTAL_RECEIVED_BYTESOBJECTTotal number of bytes that this query received from all calls to all remote services.
QUERY_LOAD_PERCENTOBJECTThe approximate percentage of active compute resources in the warehouse for this query execution.
QUERY_ACCELERATION_BYTES_SCANNEDOBJECTNumber of bytes scanned by the query acceleration service.
QUERY_ACCELERATION_PARTITIONS_SCANNEDOBJECTNumber of partitions scanned by the query acceleration service.
QUERY_ACCELERATION_UPPER_LIMIT_SCALE_FACTOROBJECTUpper limit scale factor that a query would have benefited from.
CHILD_QUERIES_WAIT_TIMEOBJECTTime (in milliseconds) to complete the cached lookup when calling a memoizable function.
HYBRID_TABLE_REQUESTS_THROTTLED_COUNTNUMBERNumber of hybrid table queries that were throttled.

OBJECT 数据类型包含以下字段:

Field NameDescription
sumSum across all executions within the aggregation interval.
avgAverage across all executions within the aggregation interval.
stddevStandard deviation across all executions within the aggregation interval.
minMinimum across all executions within the aggregation interval.
medianMedian across all executions within the aggregation interval.
p9090th percentile across all executions within the aggregation interval.
p9999th percentile across all executions within the aggregation interval.
p99.999.9th percentile across all executions within the aggregation interval.
maxMaximum across all executions within the aggregation interval.

Note

The following columns of the type OBJECT do not contain a sum field:

  • PERCENTAGE_SCANNED_FROM_CACHE
  • QUERY_LOAD_PERCENT
  • QUERY_ACCELERATION_UPPER_LIMIT_SCALE_FACTOR

QUERY_PARAMETERIZED_HASH 列

QUERY_PARAMETERIZED_HASH 列包含根据参数化查询(即参数化所有字面量后的查询版本)计算的哈希值。

例如,以下查询具有相同的 QUERY_PARAMETERIZED_HASH 值:

SELECT * FROM table1 WHERE table1.name = 'TIM'
SELECT * FROM table1 WHERE table1.name = 'AIHUA'

QUERY_PARAMETERIZED_HASH 值具有以下限制:

  • 常数字面量必须是以下谓词的二元函数:等于、不等于、大于(或等于)、小于(或等于)。
  • 别名必须相同。

只要 SQL 文本存在差异,QUERY_HASH 和 QUERY_PARAMETERIZED_HASH 的值就会不同,但以下情况除外:

  • 标识符/会话变量/暂存区名称不区分大小写。
  • 空格差异被忽略。
  • 满足上述二元谓词规则的字面量。

使用说明

视图的延迟时间最长可达 180 分钟(3 小时)。

示例

您可以使用 AGGREGATE_QUERY_HISTORY 视图来监控可能存在的错误、排队、锁定阻塞或混合表限制等问题。您通常希望这些指标始终处于较低水平。如果您看到这些指标中的任何一个出现峰值,则可能表明存在问题:

SET (START_DATE, END_DATE) = ('2023-11-01', '2023-11-08');

WITH time_issues AS
(
    SELECT
        interval_start_time
        , SUM(transaction_blocked_time:"sum") AS transaction_blocked_time
        , SUM(queued_provisioning_time:"sum") AS queued_provisioning_time
        , SUM(queued_repair_time:"sum") AS queued_repair_time
        , SUM(queued_overload_time:"sum") AS queued_overload_time
        , SUM(hybrid_table_requests_throttled_count) AS hybrid_table_requests_throttled_count
    FROM snowflake.account_usage.aggregate_query_history
    WHERE TRUE
        AND interval_start_time > $START_DATE
        AND interval_start_time < $END_DATE
    GROUP BY ALL
),
errors AS
(
    SELECT
        interval_start_time
        , SUM(value:"count") as error_count
    FROM
    (
        SELECT
            a.interval_start_time
            , e.*
        FROM
            snowflake.account_usage.aggregate_query_history a,
            TABLE(FLATTEN(input => errors)) e
        WHERE TRUE
            AND interval_start_time > $START_DATE
            AND interval_start_time < $END_DATE
    )
    GROUP BY ALL
)
SELECT
    time_issues.interval_start_time
    , error_count
    , transaction_blocked_time
    , queued_provisioning_time
    , queued_repair_time
    , queued_overload_time
    , hybrid_table_requests_throttled_count
FROM
    time_issues FULL JOIN errors ON errors.interval_start_time = time_issues.interval_start_time
;

您可以查询视图以监控整体工作负载吞吐量和并发性。许多工作负载具有规律的周期性模式。任何意外的峰值或下降都可能值得调查。

For example, monitor throughput and concurrency for warehouse my_warehouse in the first week of November:

SELECT
    interval_start_time
    , SUM(calls) AS execution_count
    , SUM(calls) / 60 AS queries_per_second
    , COUNT(DISTINCT session_id) AS unique_sessions
    , COUNT(user_name) AS unique_users
FROM snowflake.account_usage.aggregate_query_history
WHERE TRUE
    AND warehouse_name = 'MY_WAREHOUSE'
    AND interval_start_time > '2023-11-01'
    AND interval_start_time < '2023-11-08'
GROUP BY
    interval_start_time
;

对于最常用且重复次数最多的查询,最值得集中精力进行优化或提高工作负载效率。您可以查询视图,按执行计数确定一个工作负载的热门查询。

For example, identify the top queries by execution count for warehouse my_warehouse:

SELECT
    query_parameterized_hash
    , ANY_VALUE(query_text)
    , SUM(calls) AS execution_count
FROM snowflake.account_usage.aggregate_query_history
WHERE TRUE
    AND warehouse_name = 'MY_WAREHOUSE'
    AND interval_start_time > '2023-11-01'
    AND interval_start_time < '2023-11-08'
GROUP BY
    query_parameterized_hash
ORDER BY execution_count DESC
;

要按平均总延迟确定速度最慢的查询,请执行以下操作:

SELECT
    query_parameterized_hash
    , any_value(query_text)
    , SUM(total_elapsed_time:"sum"::NUMBER) / SUM (calls) as avg_latency
FROM snowflake.account_usage.aggregate_query_history
WHERE TRUE
    AND warehouse_name = 'MY_WAREHOUSE'
    AND interval_start_time > '2023-07-01'
    AND interval_start_time < '2023-07-08'
GROUP BY
    query_parameterized_hash
ORDER BY avg_latency DESC
;

要分析感兴趣的特定查询随时间推移的性能,请执行以下操作:

SELECT
    interval_start_time
    , total_elapsed_time:"avg"::number avg_elapsed_time
    , total_elapsed_time:"min"::number min_elapsed_time
    , total_elapsed_time:"p90"::number p90_elapsed_time
    , total_elapsed_time:"p99"::number p99_elapsed_time
    , total_elapsed_time:"max"::number max_elapsed_time
FROM snowflake.account_usage.aggregate_query_history
WHERE TRUE
    AND query_parameterized_hash = '<123456>'
    AND interval_start_time > '2023-07-01'
    AND interval_start_time < '2023-07-08'
ORDER BY interval_start_time DESC
;