分析混合表的查询配置文件

Unistore workloads pose some interesting questions about query execution that you can investigate by using the Snowsight Query Profile feature or information gleaned from EXPLAIN output. In addition to monitoring overall performance and throughput, you may want to know if a table scan is being executed against the row store or object storage, or whether a specific secondary index is being used.

This section identifies Query Profile operators and attributes that pertain to hybrid table operations and presents some examples to help you understand how to read query plans that access hybrid tables. See also Monitor query activity with Query History.

混合表扫描和索引扫描

表和索引扫描运算符会在查询配置文件中出现,显示对混合表的访问。这些运算符通常出现在树形结构的底部,代表读取执行特定查询所需数据的第一步。针对标准表的查询始终使用表扫描;不使用索引扫描。

When a primary key index is used to scan a hybrid table, a TableScan operator appears in the query profile, not an IndexScan operator. When any other index is used to scan a hybrid table, such as a secondary index, you will see an IndexScan operator.

Under Attributes for the IndexScan operator, you can see the fully qualified name of the index and Access predicates. These are the predicates that are applied to the index during the scan. You can also see predicates for filters that are applied during table scans.

When a predicate is “pushed” to an index, the predicate contains a placeholder, inside parentheses, for the constant that was used in the query. For example: SENSOR_DATA_DEVICE2.DEVICE_ID = (:SFAP_PRE_NR_1)

扫描模式

混合表数据以两种格式存储,用于满足操作和分析工作负载的需求。管理员经常问的一个问题是,给定查询访问的是行存储还是列存储(对象存储)。查询可以从一种或两种类型的存储中读取,具体取决于所涉及的表、查询的具体要求、索引的可用性和其他因素。

The query profile for hybrid table queries includes a Scan Mode attribute for each table scan operator in the tree:

  • ROW-BASED: The query reads from the table data in the row store, or uses indexes to compute query results.
  • COLUMN-BASED: The query reads from an object storage copy of the same data that was loaded into the row store. Index scans can also access object storage, for Time Travel queries.

Scan mode is specific to hybrid tables. If a table scan is run on a standard table, no Scan Mode attribute is displayed.

从列式仓库缓存中读取的数据

Where possible, table scans for hybrid tables read data from a columnar warehouse cache. This cache is an extension to the standard warehouse cache; see Optimizing the warehouse cache. The cache contains data that has been read from the hybrid table storage provider and is accessible by read-only queries against hybrid tables.

To see cache usage in a given query profile, select the table scan operator and check the Percentage scanned from cache under Statistics.

Queries that select from hybrid tables do not benefit from the query results cache.

混合表请求节流

In the Profile Overview, you can see a Hybrid Table Requests Throttling percentage. To see this overview, do not select an operator in the tree; the overview applies to the whole query plan.

For example, the following query recorded that 87.5% of its execution time was spent being throttled by the hybrid table storage provider. A high throttling percentage is an indicator that too many hybrid table read and write requests are being sent to the storage provider, relative to the quota for the database. For more information, see Quotas and throttling.

Query profile overview shows a high throttling percentage for hybrid table requests.

示例

The following Snowsight examples of query profiles show attributes specific to hybrid table operations. To understand these examples, you do not need to create and load the tables that are queried and modified. However, here is the CREATE TABLE statement for one of the tables for reference. Note the definition of the PRIMARY KEY constraint (on the timestamp column) and a secondary index (on the device_id column):

CREATE OR REPLACE HYBRID TABLE sensor_data_device1 (
  timestamp TIMESTAMP_NTZ PRIMARY KEY,
  device_id VARCHAR(10),
  temperature DECIMAL(6,4),
  vibration DECIMAL(6,4),
  motor_rpm INT,
  INDEX device_idx(device_id)
 );

Another similar hybrid table, sensor_data_device2, is also used in the examples.

访问主键列的查询计划

When your query filters the primary key of the table (timestamp), which is automatically indexed, the query profile uses a TableScan operator. Also note that ROW_BASED scan mode is used for this query.

SELECT * FROM sensor_data_device1 WHERE timestamp='2024-03-01 13:45:56.000';
TableScan operator for query that filters on the primary key column, timestamp

访问二级索引的查询计划

生成此配置文件的查询如下所示:

SELECT COUNT(*) FROM sensor_data_device1 WHERE device_id='DEVICE2';

Only part of the profile is shown here, focusing on the IndexScan operator and its attributes. The scan mode is ROW_BASED, and you can see the complete predicate by hovering over Access Predicates. The fully qualified index name is also displayed.

IndexScan operator with attributes, including access predicate and ROW_BASED scan mode

See also INCLUDE columns.

混合表的 DML 查询计划

混合表上的 DML 操作通常会修改单行。例如:

UPDATE sensor_data_device2 SET device_id='DEVICE3' WHERE timestamp = '2024-04-02 00:00:05.000';

The query profile for the TableScan operator shows that this UPDATE accesses the row store for the hybrid table (scan mode is ROW_BASED):

Table scan operator that uses a ROW_BASED scan for a single-row UPDATE

利用缓存数据提高效率的重复查询

这种情况下,假设以下查询在混合表上快速连续运行两次。

SELECT device_id, AVG(temperature)
  FROM sensor_data_device2
  WHERE temperature>33
  GROUP BY device_id;

The first query reads all of the data from object storage. The second run of the query reads 100% of the data from the columnar cache. Also note that the scan mode for this query is COLUMN_BASED.

Table scan operator that read 100% of the data from the cache

联接(混合表与标准表)的查询计划

When you join a hybrid table to a standard table, you will see a Scan Mode attribute for the scan on the hybrid table, but not on the standard table. For example, the TableScan operator on the left side of this join plan used ROW_BASED scan mode. The order_header table is a hybrid table with order_id as its primary key (the joining column in this example). The other table, truck_history, is a standard table.

TableScan operator for a hybrid table in a join, including access predicate and ROW_BASED scan mode