Performance improvements for operational queries on hybrid tables¶
Transactional workloads often execute the same short-running statements repeatedly. This public preview introduces additional automatic performance improvements for queries on hybrid tables that modify or select a small number of rows and execute in less than 100 ms. We refer to these queries as “operational queries”.
This optimization improves latency and throughput under very high query loads by reducing the setup costs for processing each query. It builds upon the existing hybrid table features that already remove much of the per-query overhead from parsing, plan compilation, and optimization through reuse and caching. Snowflake automatically recognizes recurring, structurally consistent query patterns and optimizes their execution, delivering significantly
lower latency and better price-performance for operational workloads.
To enable this public preview, set the ENABLE_USE_STABLE_PATH parameter to TRUE on a warehouse. You can use an existing warehouse or create a new one for testing.
Important
Because this feature is in public preview, don’t set the ENABLE_USE_STABLE_PATH=true parameter on a warehouse that runs production workloads.
Snowflake billing is unchanged by the optimizations in this public preview. In practice, workloads that leverage these optimizations
often see reduced total costs because of improved efficiency, reducing the amount of warehouse time to execute.
Snowflake currently does not bill for any cloud services compute credits incurred for queries that run on the optimized path. This will change when this feature reaches general availability. For more information about costs for using hybrid tables, see Evaluate cost for hybrid tables.
Operational queries are hybrid table queries that modify or select a small number of rows and execute in less than 100 ms. An operational query is eligible for the new optimization scheme when:
The same SQL statement structure is executed repeatedly.
The query operates exclusively on hybrid tables.
The size of the query result does not exceed 100 KB.
When Snowflake observes repeated executions of an eligible query, subsequent executions benefit from these optimizations. If the statement
is parameterized (for example, by using bind variables via a driver), queries with the same structure but different bind values are also eligible.
When you use this public preview, many operational workloads may experience up to 8x better throughput. For example, in internal testing on a Gen2 warehouse using a 100% read workload from the Yahoo Cloud Serving Benchmark (YCSB) (https://github.com/brianfrankcooper/YCSB), Snowflake observed up to 8x better throughput compared to running the same workload without these optimizations.
Optimization is strictly limited to SQL statements that interact with hybrid tables only. Queries that join or access standard Snowflake tables
are not optimized by this feature.
The following SQL statements benefit from the optimizations in this public preview:
Only the following join patterns are optimized in this public preview:
Primary Key to Primary Key joins: The join condition matches the primary key columns on both sides of the join.
Foreign Key to Foreign Key joins: The join condition matches foreign key columns on both sides of the join.
Index nested loop joins on a primary key: For each row from the outer table, Snowflake uses the primary key index on the inner hybrid table to look up matching rows. This pattern is efficient when one side of the join is small or highly selective and the other side is joined on its primary key.
Other join patterns continue to work but will not benefit from the full set of optimizations.
Queries that contain the following features and constructs continue to perform as expected, but do not benefit from the optimizations in this public preview.
Transactions: This preview supports only single-statement transactions with AUTOCOMMIT enabled.
Multi-statement transactions are not supported in this preview.
Observability: Snowflake tracks query execution in two system views:
QUERY_HISTORY records the first execution of each query in detail.
AGGREGATE_QUERY_HISTORY logs later successful runs of the same query to reduce logging volume.
Currently, neither view shows whether a specific execution used the optimizations in this public preview. For eligible repeatable queries, execution latency should consistently fall in the low double-digit millisecond range within a few minutes of starting a new session.
To accurately measure the throughput and latency benefits of these optimizations, Snowflake recommends simulating a realistic workload using JMeter (https://jmeter.apache.org/). A JMeter test setup (https://github.com/sfc-gh-dadhikari/htbench) is available to measure QPS under a 90% read and 10% write workload.
Your application should use a Snowflake driver for running parameterized queries and follow best practices. The Snowflake JDBC driver is recommended for this purpose.
Using Snowsight, set up a dedicated role for testing as well as a dedicated warehouse and schema.
USEROLEACCOUNTADMIN;-- Create role HT_BENCH_ROLECREATE OR REPLACEROLE ht_bench_role;GRANTROLE ht_bench_role TOROLEACCOUNTADMIN;-- Create HT_BENCH_WH warehouseCREATE OR REPLACEWAREHOUSE ht_bench_wh
WAREHOUSE_SIZE=XSMALLAUTO_SUSPEND=300AUTO_RESUME=TRUE;GRANTOWNERSHIPONWAREHOUSE ht_bench_wh TOROLE ht_bench_role;GRANTCREATEDATABASEONACCOUNTTOROLE ht_bench_role;-- Create HT_BENCH_DB database and schemaCREATE OR REPLACEDATABASE ht_bench_db;GRANTOWNERSHIPONDATABASE ht_bench_db TOROLE ht_bench_role;CREATE OR REPLACESCHEMAdata;GRANTOWNERSHIPONSCHEMA ht_bench_db.dataTOROLE ht_bench_role;-- Use roleUSEROLE ht_bench_role;-- Set context to use HT_BENCH_DB database and DATA schemaUSEDATABASE ht_bench_db;USESCHEMA ht_bench_db.data;-- Create hybrid tableCREATE OR REPLACEHYBRIDTABLE icecream_orders (
id NUMBER(38,0)NOT NULL,
store_id NUMBER(38,0)NOT NULL,
flavor VARCHAR(20)NOT NULL,
num_scoops NUMBER(38,0),PRIMARY KEY(id));-- Insert some dataINSERTINTO icecream_orders
SELECTSEQ4()ASID,UNIFORM(1,100,RANDOM())ASSTORE_ID,ARRAY_CONSTRUCT('VANILLA','CHOCOLATE','STRAWBERRY','MANGO','PISTACHIO')[UNIFORM(0,4,RANDOM())]::STRINGASFLAVOR,UNIFORM(1,5,RANDOM())ASNUM_SCOOPSFROMTABLE(GENERATOR(ROWCOUNT=>10000));
For the user to connect without a password, you need to configure the public key. Copy the public key from rsa_key.pub and paste it
into the following SQL:
USEROLEACCOUNTADMIN;CREATE OR REPLACEUSER ht_bench_user DEFAULT_ROLE= ht_bench_role;-- Copy the key text from the rsa_key.pub file created in the previous stepALTERUSER ht_bench_user SETRSA_PUBLIC_KEY='MIIBIjANBgkqh...<replace with your key>';-- Finally, grant the role to the userGRANTROLE ht_bench_role TOUSER ht_bench_user;
To accurately measure the throughput and latency benefits of these optimizations, Snowflake recommends simulating a realistic workload
using JMeter (https://jmeter.apache.org/). When running benchmarks, expect a warmup period of 1-2 minutes before the system reaches a steady state. During this time, Snowflake recognizes recurring query patterns and applies optimizations.
JMeter is an open-source load testing and performance measurement tool that can be used to benchmark and analyze the performance of hybrid
tables. It allows users to create and execute test plans that simulate concurrent workloads, measure throughput, and analyze system behavior
under load.
You can use SDKMAN! (https://sdkman.io/), an open-source tool to easily configure your Java version.
Save the following JMeter test plan as snowflake-hybrid-tables.jmx. The plan performs point-lookup SELECT queries against the icecream_orders hybrid table created earlier in this guide.
Show JMeter test plan (snowflake-hybrid-tables.jmx)
<?xml version="1.0" encoding="UTF-8"?>
<jmeterTestPlanversion="1.2"properties="5.0"jmeter="5.6.3">
<hashTree>
<TestPlanguiclass="TestPlanGui"testclass="TestPlan"testname="Hybrid Tables Test Plan">
<elementPropname="TestPlan.user_defined_variables"elementType="Arguments"guiclass="ArgumentsPanel"testclass="Arguments"testname="User Defined Variables">
<collectionPropname="Arguments.arguments">
<elementPropname="NUMBER_OF_KEYS"elementType="Argument">
<stringPropname="Argument.name">NUMBER_OF_KEYS</stringProp>
<stringPropname="Argument.value">${__P(NUMBER_OF_KEYS,100)}</stringProp>
<stringPropname="Argument.desc">The number of keys to sample for each loop to iterate on.</stringProp>
<stringPropname="Argument.metadata">=</stringProp>
</elementProp>
<elementPropname="NUMBER_OF_THREADS"elementType="Argument">
<stringPropname="Argument.name">NUMBER_OF_THREADS</stringProp>
<stringPropname="Argument.value">${__P(NUMBER_OF_THREADS,10)}</stringProp>
<stringPropname="Argument.metadata">=</stringProp>
</elementProp>
<elementPropname="ACCOUNT_LOCATOR"elementType="Argument">
<stringPropname="Argument.name">ACCOUNT_LOCATOR</stringProp>
<stringPropname="Argument.value">${__P(ACCOUNT_LOCATOR,)}</stringProp>
<stringPropname="Argument.desc">The account locator (without snowflakecomputing.cn)</stringProp>
<stringPropname="Argument.metadata">=</stringProp>
</elementProp>
<elementPropname="WAREHOUSE_NAME"elementType="Argument">
<stringPropname="Argument.name">WAREHOUSE_NAME</stringProp>
<stringPropname="Argument.value">${__P(WAREHOUSE_NAME,)}</stringProp>
<stringPropname="Argument.desc">The Snowflake warehouse used for hybrid table testing.</stringProp>
<stringPropname="Argument.metadata">=</stringProp>
</elementProp>
<elementPropname="DATABASE_NAME"elementType="Argument">
<stringPropname="Argument.name">DATABASE_NAME</stringProp>
<stringPropname="Argument.value">${__P(DATABASE_NAME,)}</stringProp>
<stringPropname="Argument.desc">The database to use for testing.</stringProp>
<stringPropname="Argument.metadata">=</stringProp>
</elementProp>
<elementPropname="SCHEMA_NAME"elementType="Argument">
<stringPropname="Argument.name">SCHEMA_NAME</stringProp>
<stringPropname="Argument.value">${__P(SCHEMA_NAME,DATA)}</stringProp>
<stringPropname="Argument.desc">The schema name to use when connecting</stringProp>
<stringPropname="Argument.metadata">=</stringProp>
</elementProp>
<elementPropname="USER_NAME"elementType="Argument">
<stringPropname="Argument.name">USER_NAME</stringProp>
<stringPropname="Argument.value">${__P(USER_NAME,)}</stringProp>
<stringPropname="Argument.desc">The username to use when connecting</stringProp>
<stringPropname="Argument.metadata">=</stringProp>
</elementProp>
<elementPropname="ROLE_NAME"elementType="Argument">
<stringPropname="Argument.name">ROLE_NAME</stringProp>
<stringPropname="Argument.value">${__P(ROLE_NAME,)}</stringProp>
<stringPropname="Argument.desc">The role name to use when connecting</stringProp>
<stringPropname="Argument.metadata">=</stringProp>
</elementProp>
<elementPropname="PRIVATE_KEY_FILE_PATH"elementType="Argument">
<stringPropname="Argument.name">PRIVATE_KEY_FILE_PATH</stringProp>
<stringPropname="Argument.value">${__P(PRIVATE_KEY_FILE_PATH,)}</stringProp>
<stringPropname="Argument.desc">The private key file name to use</stringProp>
<stringPropname="Argument.metadata">=</stringProp>
</elementProp>
<elementPropname="STABLE_PATH"elementType="Argument">
<stringPropname="Argument.name">STABLE_PATH</stringProp>
<stringPropname="Argument.value">${__P(STABLE_PATH,TRUE)}</stringProp>
<stringPropname="Argument.desc">Whether to enable the operational query optimization (TRUE or FALSE).</stringProp>
<stringPropname="Argument.metadata">=</stringProp>
</elementProp>
</collectionProp>
</elementProp>
</TestPlan>
<hashTree>
<JDBCDataSourceguiclass="TestBeanGUI"testclass="JDBCDataSource"testname="Snowflake JDBC">
<boolPropname="autocommit">true</boolProp>
<stringPropname="checkQuery"></stringProp>
<stringPropname="connectionAge">5000</stringProp>
<stringPropname="dataSource">jdbcConfig</stringProp>
<stringPropname="dbUrl">jdbc:snowflake://${ACCOUNT_LOCATOR}.snowflakecomputing.cn</stringProp>
<stringPropname="driver">net.snowflake.client.jdbc.SnowflakeDriver</stringProp>
<boolPropname="keepAlive">true</boolProp>
<stringPropname="password"></stringProp>
<stringPropname="poolMax">0</stringProp>
<stringPropname="timeout">10000</stringProp>
<stringPropname="transactionIsolation">DEFAULT</stringProp>
<stringPropname="trimInterval">60000</stringProp>
<stringPropname="username"></stringProp>
<boolPropname="preinit">false</boolProp>
<stringPropname="connectionProperties">user=${USER_NAME};db=${DATABASE_NAME};schema=${SCHEMA_NAME};warehouse=${WAREHOUSE_NAME};role=${ROLE_NAME};private_key_file=${PRIVATE_KEY_FILE_PATH};JDBC_QUERY_RESULT_FORMAT=JSON;enablePutGet=false</stringProp>
<stringPropname="TestPlan.comments">Setup snowflake connection by adjusting the connection string as required.</stringProp>
</JDBCDataSource>
<hashTree/>
<ResultCollectorguiclass="RespTimeGraphVisualizer"testclass="ResultCollector"testname="Response Time Graph">
<boolPropname="ResultCollector.error_logging">false</boolProp>
<objProp>
<name>saveConfig</name>
<valueclass="SampleSaveConfiguration">
<time>true</time>
<latency>true</latency>
<timestamp>true</timestamp>
<success>true</success>
<label>true</label>
<code>true</code>
<message>true</message>
<threadName>true</threadName>
<dataType>true</dataType>
<encoding>false</encoding>
<assertions>true</assertions>
<subresults>true</subresults>
<responseData>false</responseData>
<samplerData>false</samplerData>
<xml>false</xml>
<fieldNames>true</fieldNames>
<responseHeaders>false</responseHeaders>
<requestHeaders>false</requestHeaders>
<responseDataOnError>false</responseDataOnError>
<saveAssertionResultsFailureMessage>true</saveAssertionResultsFailureMessage>
<assertionsResultsToSave>0</assertionsResultsToSave>
<bytes>true</bytes>
<sentBytes>true</sentBytes>
<url>true</url>
<threadCounts>true</threadCounts>
<idleTime>true</idleTime>
<connectTime>true</connectTime>
</value>
</objProp>
<stringPropname="filename"></stringProp>
<stringPropname="RespTimeGraph.interval">100</stringProp>
<stringPropname="RespTimeGraph.graphtitle">Response Time Graph</stringProp>
<intPropname="RespTimeGraph.graphtitlefondsize">3</intProp>
<intPropname="RespTimeGraph.linestrockwidth">2</intProp>
<intPropname="RespTimeGraph.lineshapepoint">4</intProp>
<intPropname="RespTimeGraph.legendsize">1</intProp>
<stringPropname="RespTimeGraph.seriesselectionmatchlabel">.*Query</stringProp>
<intPropname="RespTimeGraph.legendplacement">1</intProp>
<boolPropname="RespTimeGraph.seriesselection">true</boolProp>
</ResultCollector>
<hashTree/>
<!-- Setup Thread Group - Run once before benchmark -->
<SetupThreadGroupguiclass="SetupThreadGroupGui"testclass="SetupThreadGroup"testname="Setup - Init"enabled="true">
<stringPropname="ThreadGroup.on_sample_error">continue</stringProp>
<elementPropname="ThreadGroup.main_controller"elementType="LoopController"guiclass="LoopControlPanel"testclass="LoopController"testname="Loop Controller"enabled="true">
<boolPropname="LoopController.continue_forever">false</boolProp>
<stringPropname="LoopController.loops">1</stringProp>
</elementProp>
<stringPropname="ThreadGroup.num_threads">1</stringProp>
<stringPropname="ThreadGroup.ramp_time">0</stringProp>
<boolPropname="ThreadGroup.scheduler">false</boolProp>
</SetupThreadGroup>
<hashTree>
<JDBCSamplerguiclass="TestBeanGUI"testclass="JDBCSampler"testname="Set Stable Path on Warehouse"enabled="true">
<stringPropname="dataSource">jdbcConfig</stringProp>
<stringPropname="query">ALTER WAREHOUSE ${WAREHOUSE_NAME} SET ENABLE_USE_STABLE_PATH = ${STABLE_PATH}</stringProp>
<stringPropname="queryArguments"></stringProp>
<stringPropname="queryArgumentsTypes"></stringProp>
<stringPropname="queryTimeout">60</stringProp>
<stringPropname="queryType">Update Statement</stringProp>
<stringPropname="resultSetHandler">Store as String</stringProp>
<stringPropname="resultSetMaxRows"></stringProp>
<stringPropname="resultVariable"></stringProp>
<stringPropname="variableNames"></stringProp>
</JDBCSampler>
<hashTree/>
<JDBCSamplerguiclass="TestBeanGUI"testclass="JDBCSampler"testname="Fetch Sample IDs (Once)"enabled="true">
<stringPropname="dataSource">jdbcConfig</stringProp>
<stringPropname="queryType">Select Statement</stringProp>
<stringPropname="query">SELECT ID FROM ICECREAM_ORDERS SAMPLE (${NUMBER_OF_KEYS} ROWS);</stringProp>
<stringPropname="queryArguments"></stringProp>
<stringPropname="queryArgumentsTypes"></stringProp>
<stringPropname="variableNames">SAMPLED_ID</stringProp>
<stringPropname="resultVariable">sampledIds</stringProp>
<stringPropname="queryTimeout"></stringProp>
<stringPropname="resultSetMaxRows"></stringProp>
<stringPropname="resultSetHandler">Store as Object</stringProp>
<stringPropname="TestPlan.comments">Fetch sample IDs ONCE and store globally for all threads</stringProp>
</JDBCSampler>
<hashTree>
<JSR223PostProcessorguiclass="TestBeanGUI"testclass="JSR223PostProcessor"testname="Store IDs Globally"enabled="true">
<stringPropname="scriptLanguage">groovy</stringProp>
<stringPropname="parameters"></stringProp>
<stringPropname="filename"></stringProp>
<stringPropname="cacheKey">true</stringProp>
<stringPropname="script">
// Get the result set from the JDBC sampler
def ids = vars.getObject("sampledIds")
def idList = []
if (ids != null) {
ids.each { row ->
idList.add(row.get("ID").toString())
}
}
// Store as comma-separated string in JMeter properties (global)
def idString = idList.join(",")
props.put("GLOBAL_IDS", idString)
props.put("GLOBAL_IDS_COUNT", idList.size().toString())
log.info("Stored " + idList.size() + " IDs globally for all threads")
</stringProp>
</JSR223PostProcessor>
<hashTree/>
</hashTree>
</hashTree>
<ThreadGroupguiclass="ThreadGroupGui"testclass="ThreadGroup"testname="SELECT Thread Group">
<stringPropname="TestPlan.comments">This thread group will issue SELECT commands using shared IDs</stringProp>
<stringPropname="ThreadGroup.num_threads">${NUMBER_OF_THREADS}</stringProp>
<intPropname="ThreadGroup.ramp_time">1</intProp>
<boolPropname="ThreadGroup.same_user_on_next_iteration">true</boolProp>
<stringPropname="ThreadGroup.on_sample_error">continue</stringProp>
<elementPropname="ThreadGroup.main_controller"elementType="LoopController"guiclass="LoopControlPanel"testclass="LoopController"testname="Loop Controller">
<boolPropname="LoopController.continue_forever">true</boolProp>
<intPropname="LoopController.loops">-1</intProp>
</elementProp>
<boolPropname="ThreadGroup.scheduler">true</boolProp>
<stringPropname="ThreadGroup.duration">${__P(DURATION,300)}</stringProp>
<stringPropname="ThreadGroup.delay">0</stringProp>
</ThreadGroup>
<hashTree>
<!-- Pick random ID from global pool each iteration -->
<JSR223PreProcessorguiclass="TestBeanGUI"testclass="JSR223PreProcessor"testname="Pick Random ID"enabled="true">
<stringPropname="scriptLanguage">groovy</stringProp>
<stringPropname="parameters"></stringProp>
<stringPropname="filename"></stringProp>
<stringPropname="cacheKey">true</stringProp>
<stringPropname="script">
// Get global IDs (fetched once in setup)
def idString = props.get("GLOBAL_IDS")
if (idString != null && idString.length() > 0) {
def ids = idString.split(",")
def randomIndex = new Random().nextInt(ids.length)
vars.put("ID", ids[randomIndex])
} else {
log.error("GLOBAL_IDS not found - setup may have failed")
vars.put("ID", "1")
}
</stringProp>
</JSR223PreProcessor>
<hashTree/>
<JDBCSamplerguiclass="TestBeanGUI"testclass="JDBCSampler"testname="Orders Query">
<stringPropname="dataSource">jdbcConfig</stringProp>
<stringPropname="queryType">Prepared Select Statement</stringProp>
<stringPropname="query">SELECT * FROM ICECREAM_ORDERS WHERE ID = ?</stringProp>
<stringPropname="queryArguments">${ID}</stringProp>
<stringPropname="queryArgumentsTypes">INTEGER</stringProp>
<stringPropname="variableNames"></stringProp>
<stringPropname="resultVariable"></stringProp>
<stringPropname="queryTimeout"></stringProp>
<stringPropname="resultSetMaxRows"></stringProp>
<stringPropname="resultSetHandler">Store as Object</stringProp>
</JDBCSampler>
<hashTree>
<ResultCollectorguiclass="ViewResultsFullVisualizer"testclass="ResultCollector"testname="View Results Tree"enabled="false">
<boolPropname="ResultCollector.error_logging">false</boolProp>
<objProp>
<name>saveConfig</name>
<valueclass="SampleSaveConfiguration">
<time>true</time>
<latency>true</latency>
<timestamp>true</timestamp>
<success>true</success>
<label>true</label>
<code>true</code>
<message>true</message>
<threadName>true</threadName>
<dataType>true</dataType>
<encoding>false</encoding>
<assertions>true</assertions>
<subresults>true</subresults>
<responseData>false</responseData>
<samplerData>false</samplerData>
<xml>false</xml>
<fieldNames>true</fieldNames>
<responseHeaders>false</responseHeaders>
<requestHeaders>false</requestHeaders>
<responseDataOnError>false</responseDataOnError>
<saveAssertionResultsFailureMessage>true</saveAssertionResultsFailureMessage>
<assertionsResultsToSave>0</assertionsResultsToSave>
<bytes>true</bytes>
<sentBytes>true</sentBytes>
<url>true</url>
<threadCounts>true</threadCounts>
<idleTime>true</idleTime>
<connectTime>true</connectTime>
</value>
</objProp>
<stringPropname="filename"></stringProp>
</ResultCollector>
<hashTree/>
<ResultCollectorguiclass="GraphVisualizer"testclass="ResultCollector"testname="Lookup Response">
<boolPropname="ResultCollector.error_logging">false</boolProp>
<objProp>
<name>saveConfig</name>
<valueclass="SampleSaveConfiguration">
<time>true</time>
<latency>true</latency>
<timestamp>true</timestamp>
<success>true</success>
<label>true</label>
<code>true</code>
<message>true</message>
<threadName>true</threadName>
<dataType>true</dataType>
<encoding>false</encoding>
<assertions>true</assertions>
<subresults>true</subresults>
<responseData>false</responseData>
<samplerData>false</samplerData>
<xml>false</xml>
<fieldNames>true</fieldNames>
<responseHeaders>false</responseHeaders>
<requestHeaders>false</requestHeaders>
<responseDataOnError>false</responseDataOnError>
<saveAssertionResultsFailureMessage>true</saveAssertionResultsFailureMessage>
<assertionsResultsToSave>0</assertionsResultsToSave>
<bytes>true</bytes>
<sentBytes>true</sentBytes>
<url>true</url>
<threadCounts>true</threadCounts>
<idleTime>true</idleTime>
<connectTime>true</connectTime>
</value>
</objProp>
<stringPropname="filename"></stringProp>
</ResultCollector>
<hashTree/>
</hashTree>
</hashTree>
</hashTree>
</hashTree>
</jmeterTestPlan>
Note
For accurate benchmark results, run the JMeter client from a cloud virtual machine in the same cloud region as the target Snowflake deployment. The VM should have at least 8 cores so that the client itself is not the bottleneck when driving sustained concurrent load. Snowflake recommends a regional VM or a Snowpark Container Services deployment, so that measured throughput and latency reflect hybrid table performance rather than cross-region network effects or client-side CPU limits. Running JMeter from a laptop or a cross-region machine is supported but will significantly understate the throughput and overstate the latency you would observe in production.
This command runs JMeter for 5 minutes (300 seconds), using 150 concurrent threads that perform point lookups against a sampled set of 250 keys.
The test plan drives the load as fast as the warehouse can serve it; there is no client-side rate limit. Before the benchmark begins, the test
plan runs ALTER WAREHOUSE ... SET ENABLE_USE_STABLE_PATH = TRUE to enable the optimization. To measure the baseline without these optimizations,
re-run the command with -JSTABLE_PATH=FALSE. You can adjust the other parameters (JNUMBER_OF_THREADS, JDURATION, JNUMBER_OF_KEYS) to
simulate different scenarios.
After the run is complete, you can see the throughput (queries per second). Allow 1-2 minutes of warmup before the system reaches steady state. After
warmup, the output might show that QPS peaks at almost 5,000 QPS on an XSMALL warehouse:
The following statements should give you an idea of the types of queries that would benefit from the optimizations in this public preview.
To take advantage of these optimizations, these statements would need to be submitted repeatedly through a supported driver. For more information, see Using a driver for parameterized queries.
-- Simple point-lookupsSELECT*FROM icecream_orders WHERE id =100;-- Filtered scan with predicatesSELECT flavor, num_scoops FROM icecream_orders WHERE store_id =3AND num_scoops >=2;-- DMLUPDATE icecream_orders SET num_scoops =3WHERE id =500;INSERTINTO icecream_orders VALUES(99999999,42,'VANILLA',3);DELETEFROM icecream_orders WHERE id =500;
The following examples are queries that would not benefit from the optimizations in this public preview because of documented limitations.
-- GROUP BYSELECT flavor,COUNT(*)AS order_count
FROM icecream_orders
WHERE store_id =5GROUP BY flavor
ORDER BY order_count DESC;-- Joins (create a stores table if you want to run this query)SELECT*FROM icecream_orders o JOIN stores s ON o.store_id = s.id;-- Multi-statement transactions without AUTOCOMMITBEGIN;UPDATE icecream_orders SET num_scoops =3WHERE id =500;INSERTINTO icecream_orders VALUES(99999999,42,'VANILLA',3);COMMIT;-- Statements containing subqueries and/or limits > 1 are also not optimizedSELECT*FROM(SELECT id, store_id, flavor, num_scoops
FROM icecream_orders
WHERE num_scoops >=2) sub
LIMIT10;
For this public preview, Snowflake recommends the Snowflake JDBC driver, especially when you use it with a connection pool.
Executions through other drivers, such as the Go Snowflake driver, also benefit from these improvements.
While the Snowflake connector for Python is supported, workloads using it may experience more limited benefits because of differences in
connection management, including the lack of built-in connection pooling.