监控混合表的工作负载

利用混合表的 Unistore 工作负载可能与在 Snowflake 中运行的许多分析工作负载不同。例如,您的工作负载可能包含较少的唯一查询,这些查询运行所需时间更少、执行频率更高。您可以通过多种可选方式来监控工作负载。

使用“Account Usage”视图监控工作负载

若要有效监控操作工作负载,请使用 AGGREGATE_QUERY_HISTORY 视图。通过此视图,您可以监控工作负载的运行状况、诊断问题并确定优化途径。AGGREGATE_QUERY_HISTORY 视图汇总了一段时间间隔内重复参数化查询的查询执行统计信息,以便更轻松、更高效地识别工作负载和查询随时间推移的模式。请注意,此视图的输出将合并所有 Snowflake 工作负载和查询。

AGGREGATE_QUERY_HISTORY 视图可帮助您回答以下有关工作负载的问题:

  • 我的虚拟仓库每秒执行多少次操作?

  • 在我的工作负载中,哪些查询消耗的总时间或资源量最多?

  • 随着时间的推移,特定查询的性能是否发生了重大变化?

为了帮助提高工作负载的性能和效率,低延迟操作的个别执行(不到一秒)将不会存储在 QUERY_HISTORY 视图 中,也不会生成唯一的查询配置文件。相反,该查询的重复执行的汇总统计信息将在 AGGREGATE_QUERY_HISTORY 视图中返回。您还可以查看选定时间间隔内,该查询的抽样查询配置文件。

监控整体工作负载运行状况

使用 AGGREGATE_QUERY_HISTORY 视图,以监控整体工作负载吞吐量和并发性,并调查工作负载中的意外峰值或低谷。例如:

SELECT
    interval_start_time
    , SUM(calls) as execution_count
    , SUM(calls) / 60 as queries_per_second
    , COUNT(DISTINCT session_id) as unique_sessions
    , COUNT(user_name) as unique_users
FROM snowflake.account_usage.aggregate_query_history
WHERE TRUE
          AND warehouse_name = '<MY_WAREHOUSE>'
          AND interval_start_time > $START_DATE
          AND interval_start_time < $END_DATE
GROUP BY
          interval_start_time
;
Copy

您还可以使用汇总查询历史记录来监控错误、排队、锁定阻塞或限制等潜在问题。例如:

WITH time_issues AS
(
    SELECT
        interval_start_time
        , SUM(transaction_blocked_time:"SUM") as transaction_blocked_time
        , SUM(queued_provisioning_time:"SUM") as queued_provisioning_time
        , SUM(queued_repair_time:"SUM") as queued_repair_time
        , SUM(queued_overload_time:"SUM") as queued_overload_time
        , SUM(hybrid_table_requests_throttled_count) as hybrid_table_requests_throttled_count
    FROM snowflake.account_usage.aggregate_query_history
    WHERE TRUE
              AND warehouse_name = '<MY_WAREHOUSE>'
          AND interval_start_time > $START_DATE
          AND interval_start_time < $END_DATE
    GROUP BY ALL
),
errors AS
(
    SELECT
        interval_start_time
        , SUM(value:"count") as error_count
    FROM
    (
        SELECT
            a.interval_start_time
            ,e.*
        FROM
            snowflake.account_usage.aggregate_query_history a,
            TABLE(flatten(input => errors)) e
        WHERE TRUE
            AND interval_start_time > $START_DATE
                  AND interval_start_time < $END_DATE
  )
  GROUP BY ALL
)
    SELECT
        time_issues.interval_start_time
        , error_count
        , transaction_blocked_time
        , queued_provisioning_time
        , queued_repair_time
        , queued_overload_time
        , hybrid_table_requests_throttled_count
    FROM
        time_issues FULL JOIN errors ON errors.interval_start_time = time_issues.interval_start_time
;
Copy

此类指标通常应保持在较低水平。如果看到意外峰值,建议您调查原因。

确定和调查重复查询

您可以选择优化或调查常见查询和频繁执行的查询的性能,以提高工作负载的效率。使用 AGGREGATE_QUERY_HISTORY 视图,按执行计数确定一个工作负载的热门查询。例如:

SELECT
    query_parameterized_hash
    , any_value(query_text)
    , SUM(calls) as execution_count
FROM snowflake.account_usage.aggregate_query_history
WHERE TRUE
          AND warehouse_name = '<MY_WAREHOUSE>'
          AND interval_start_time > '2024-02-01'
          AND interval_start_time < '2024-02-08'
GROUP BY
          query_parameterized_hash
ORDER BY execution_count DESC
;
Copy

您可以选择查看速度最慢的查询的指标。例如:

SELECT
    query_parameterized_hash
    , any_value(query_text)
    , SUM(total_elapsed_time:"sum"::NUMBER) / SUM (calls) as avg_latency
FROM snowflake.account_usage.aggregate_query_history
WHERE TRUE
          AND warehouse_name = '<MY_WAREHOUSE>'
          AND interval_start_time > '2024-02-01'
          AND interval_start_time < '2024-02-08'
GROUP BY
          query_parameterized_hash
ORDER BY avg_latency DESC
;
Copy

您可以分析特定查询在一段时间内的性能,以深入了解延迟趋势。例如:

SELECT
    interval_start_time
    , total_elapsed_time:"avg"::number avg_elapsed_time
    , total_elapsed_time:"min"::number min_elapsed_time
    , total_elapsed_time:"p90"::number p90_elapsed_time
    , total_elapsed_time:"p99"::number p99_elapsed_time
    , total_elapsed_time:"max"::number max_elapsed_time
FROM snowflake.account_usage.aggregate_query_history
WHERE TRUE
          AND query_parameterized_hash = '<123456>'
          AND interval_start_time > '2024-02-01'
          AND interval_start_time < '2024-02-08'
ORDER BY interval_start_time DESC
;
Copy

此查询计算总查询时间。您还可以修改查询,以在查询的不同阶段(编译、执行、排队和锁定等待)返回更精细的指标。将返回每个阶段的汇总统计信息。

语言: 中文