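Monitor hybrid table workloads

Unistore workloads that use hybrid tables differ from many of the analytical workloads that run in Snowflake. For example, your workload might contain fewer unique queries, each of which takes less time to run and executes more frequently. You have several options for monitoring these workloads.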

Monitor transactions

Hybrid tables support Snowflake transaction monitoring features, including the SHOW TRANSACTIONS, DESCRIBE TRANSACTION, and SHOW LOCKS commands and the LOCK_WAIT_HISTORY view.

These commands and views behave the same for hybrid tables as they do for standard Snowflake tables, with the following exceptions:

  • A new ROW lock type is introduced in the SHOW LOCKS command to represent row locks against hybrid tables. The locks are summarized to show one transaction holding (one or multiple) row locks and another transaction waiting for these locks.
  • LOCK_WAIT_HISTORY does not show schema-related information.
  • LOCK_WAIT_HISTORY does not summarize BLOCKER_QUERIES. If a query is blocked by multiple blockers, then they will appear as multiple records in the view rather than as multiple entries in the BLOCKER_QUERIES JSON array for the single waiter record.
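  • For the results of SHOW LOCKS and the LOCK_WAIT_HISTORY view:
    • When row locks are summarized, the transaction that holds the locks is assumed to have acquired them when it started.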
    • Due to the potential high volume of Unistore transactions, only locks that have blocked other transaction(s) for an extended period (approximately 5 seconds) are shown.
    • The lock-waiting transaction might still appear to be waiting for the locks even if it has acquired them (for no more than 1 minute). The accuracy of lock reporting will improve in future releases.
    • If a statement that blocked a waiting query has completed and was a short-running query against hybrid tables, the following information for the blocker query is not shown in the BLOCKER_QUERIES field of the waiting query record:
      • Query UUID of the blocker query
      • Session ID of the blocker query
      • User name of the blocker query
      • Database ID of the blocker query
      • Database name of the blocker query
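
As a rough illustration of the monitoring features listed above, the following statements show one way to inspect current transactions and locks and to review recent lock waits. The seven-day window and the selected columns are assumptions chosen for this sketch; confirm the column names against the LOCK_WAIT_HISTORY view reference before relying on them.

```sql
-- List transactions that are currently running in the account.
SHOW TRANSACTIONS IN ACCOUNT;

-- List locks that are currently held or waited on, including ROW locks on hybrid tables.
SHOW LOCKS IN ACCOUNT;

-- Review lock waits recorded over the last seven days (window chosen for illustration).
SELECT
    requested_at
    , lock_acquired_at
    , lock_type
    , object_name
    , query_id
    , blocker_queries
FROM snowflake.account_usage.lock_wait_history
WHERE requested_at > DATEADD('day', -7, CURRENT_TIMESTAMP())
ORDER BY requested_at DESC;
```

Because a query blocked by several blockers appears as several records for hybrid tables, expect repeated query_id values in the last result rather than a multi-entry BLOCKER_QUERIES array.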

Monitor workloads

To monitor your operational workloads effectively, use the AGGREGATE_QUERY_HISTORY view. This view enables you to monitor the health of your workload, diagnose issues, and identify avenues for optimization. The AGGREGATE_QUERY_HISTORY view aggregates query execution statistics for a repeated parameterized query over a time interval so that it is easier and more efficient to identify patterns in your workloads and queries over time. Note that all Snowflake workloads and queries will be combined in the output of this view.

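The AGGREGATE_QUERY_HISTORY view can help you answer the following questions about your workload:

  • How many operations per second does my virtual warehouse execute?
  • Which queries consume the most total time or resources in my workload?
  • Has the performance of a particular query changed significantly over time?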

To help improve performance and efficiency in your workload, individual executions of low-latency operations (under one second) are not stored in the QUERY_HISTORY view, nor do they generate a unique query profile. Instead, aggregate statistics for repeated executions of that query are returned in the AGGREGATE_QUERY_HISTORY view. You can also view a sampled query profile for the query over a selected time interval. For more information about this behavior, see Usage notes.

Tip

You can use the Grouped Query History view in Snowsight to visualize performance and statistics for typical hybrid table workloads. This view does not capture all hybrid table activity, but it provides a good alternative to monitoring performance for a large volume of individual queries that are somewhat repetitive and run extremely fast.

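Monitor overall workload health

Use the AGGREGATE_QUERY_HISTORY view to monitor overall workload throughput and concurrency, and to investigate unexpected spikes or dips in your workload. For example: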

SELECT
    interval_start_time
    , SUM(calls) as execution_count
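    -- Dividing by 60 assumes that each interval spans one minute
    , SUM(calls) / 60 as queries_per_second
    , COUNT(DISTINCT session_id) as unique_sessions
    -- DISTINCT is needed so that each user is counted only once
    , COUNT(DISTINCT user_name) as unique_users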
FROM snowflake.account_usage.aggregate_query_history
WHERE warehouse_name = '<MY_WAREHOUSE>'
  AND interval_start_time > $START_DATE
  AND interval_start_time < $END_DATE
GROUP BY ALL;

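You can also use aggregate query history to monitor potential problems such as errors, queueing, lock blocking, or throttling. For example: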

WITH time_issues AS
(
    SELECT
        interval_start_time
        , SUM(transaction_blocked_time:"sum") as transaction_blocked_time
        , SUM(queued_provisioning_time:"sum") as queued_provisioning_time
        , SUM(queued_repair_time:"sum") as queued_repair_time
        , SUM(queued_overload_time:"sum") as queued_overload_time
        , SUM(hybrid_table_requests_throttled_count) as hybrid_table_requests_throttled_count
    FROM snowflake.account_usage.aggregate_query_history
    WHERE WAREHOUSE_NAME = '<MY_WAREHOUSE>'
      AND interval_start_time > $START_DATE
      AND interval_start_time < $END_DATE
    GROUP BY ALL
),
errors AS
(
    SELECT
        interval_start_time
        , SUM(value:"count") as error_count
    FROM
    (
        SELECT
            a.interval_start_time
            ,e.*
        FROM
            snowflake.account_usage.aggregate_query_history a,
            TABLE(flatten(input => errors)) e
        WHERE interval_start_time > $START_DATE
          AND interval_start_time < $END_DATE
    )
    GROUP BY ALL
)
    SELECT
        ts.interval_start_time
        , error_count
        , transaction_blocked_time
        , queued_provisioning_time
        , queued_repair_time
        , queued_overload_time
        , hybrid_table_requests_throttled_count
    FROM time_issues ts
    FULL JOIN errors e ON e.interval_start_time = ts.interval_start_time
;

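These metrics should generally stay low. If you see an unexpected spike, investigate the cause.

Identify and investigate repeated queries

To make your workload more efficient, you can optimize or investigate the performance of common and frequently executed queries. Use the AGGREGATE_QUERY_HISTORY view to identify the top queries in a workload by execution count. For example: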

SELECT
    query_parameterized_hash
    , any_value(query_text)
    , SUM(calls) as execution_count
FROM snowflake.account_usage.aggregate_query_history
WHERE TRUE
          AND warehouse_name = '<MY_WAREHOUSE>'
          AND interval_start_time > '2024-02-01'
          AND interval_start_time < '2024-02-08'
GROUP BY
          query_parameterized_hash
ORDER BY execution_count DESC
;

You can also look at metrics for the slowest queries. For example:

SELECT
    query_parameterized_hash
    , any_value(query_text)
    , SUM(total_elapsed_time:"sum"::NUMBER) / SUM (calls) as avg_latency
FROM snowflake.account_usage.aggregate_query_history
WHERE TRUE
          AND warehouse_name = '<MY_WAREHOUSE>'
          AND interval_start_time > '2024-02-01'
          AND interval_start_time < '2024-02-08'
GROUP BY
          query_parameterized_hash
ORDER BY avg_latency DESC
;

You can analyze the performance of a specific query over a period of time to gain insight into latency trends. For example:

SELECT
    interval_start_time
    , total_elapsed_time:"avg"::number avg_elapsed_time
    , total_elapsed_time:"min"::number min_elapsed_time
    , total_elapsed_time:"p90"::number p90_elapsed_time
    , total_elapsed_time:"p99"::number p99_elapsed_time
    , total_elapsed_time:"max"::number max_elapsed_time
FROM snowflake.account_usage.aggregate_query_history
WHERE TRUE
          AND query_parameterized_hash = '<123456>'
          AND interval_start_time > '2024-02-01'
          AND interval_start_time < '2024-02-08'
ORDER BY interval_start_time DESC
;

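This query reports total query time. You can also modify the query to return more granular metrics for the different phases of a query (compilation, execution, queueing, and lock waiting). Summary statistics are returned for each phase.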
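
One possible variation that breaks latency down by phase is sketched below. It assumes that the view exposes compilation_time and execution_time columns with the same statistic keys ("avg", "p99", and so on) as total_elapsed_time; queued_overload_time and transaction_blocked_time are the columns used earlier on this page. The hash value and dates are placeholders.

```sql
SELECT
    interval_start_time
    -- Average time per phase for each interval
    , compilation_time:"avg"::number AS avg_compilation_time
    , execution_time:"avg"::number AS avg_execution_time
    , queued_overload_time:"avg"::number AS avg_queued_overload_time
    , transaction_blocked_time:"avg"::number AS avg_transaction_blocked_time
    -- Tail latency of the execution phase
    , execution_time:"p99"::number AS p99_execution_time
FROM snowflake.account_usage.aggregate_query_history
WHERE TRUE
          AND query_parameterized_hash = '<123456>'
          AND interval_start_time > '2024-02-01'
          AND interval_start_time < '2024-02-08'
ORDER BY interval_start_time DESC
;
```

Comparing the phase averages side by side can show whether a latency change comes from compilation, execution, warehouse queueing, or lock waits.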