Understanding cost for Cortex Search Services

Cost categories

A Cortex Search Service incurs the following types of cost:

Category	Description
Virtual warehouse compute	A Cortex Search Service requires a virtual warehouse to refresh the service: to run queries against base objects when they are initialized and refreshed, including orchestrating text embedding jobs and building the search index. These operations use compute resources, which consume credits. If no changes are identified during a refresh, virtual warehouse credits aren’t consumed since there’s no new data to refresh.
EMBED_TEXT tokens compute	A Cortex Search Service automatically embeds each text row in the search column specified in the `ON` parameter into vector space to enable semantic search, which incurs a credit cost per token embedded. This involves calling EMBED_TEXT_768 or EMBED_TEXT_1024 to convert each document into a series of numbers that encodes its meaning. Embeddings are computed each time a row is inserted or updated. Embeddings are processed incrementally in the evaluation of the source query, so the embedding cost is only incurred for added or changed documents. See Vector Embeddings for more information on vector embedding costs.
Multi-index Cortex Search	Multi-index Cortex Search Services have costs dependent on how you embed tokens and the number of columns you index. Larger embedding vectors or higher numbers of index columns incur higher costs. Embeddings are computed each time a row is inserted or updated. Embeddings are processed incrementally in the evaluation of the source query, so the embedding cost is only incurred for added or changed documents.
Serving compute	A Cortex Search Service uses multi-tenant serving compute, separate from a user-provided Virtual Warehouse, to establish a low-latency, high-throughput service. The compute cost for this component is incurred per GB per month (GB/mo) of uncompressed indexed data, where indexed data is the user-provided data in the Cortex Search source query, plus vector embeddings computed on the user’s behalf. You incur these costs while the service is available to respond to queries, even if no queries are served during a given period. For the Cortex Search Serving credit rate per GB/mo of indexed data, see the Snowflake Service Consumption Table.
Storage	Cortex Search Services materialize the source query into a table stored in your account. This table is transformed into data structures that are optimized for low-latency serving, also stored in your account. Storage for the table and intermediate data structures is billed at a flat rate per terabyte (TB).
Cloud services compute	Cortex Search Services use Cloud Services compute to identify changes in underlying base objects and whether the virtual warehouse needs to be invoked. Cloud services compute cost is subject to the constraint that Snowflake only bills if the daily cloud services cost is greater than 10% of the daily warehouse cost for the account.

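This topic provides information about these costs, along with tips for managing them effectively.

Managing indexing costs

You may find the following tips helpful for managing the indexing costs of your Cortex Search Services: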

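Minimize warehouse size: Most services see no improvement in indexing performance from warehouses larger than LARGE, and many services need only MEDIUM. Most of the compute time spent building the index is used by the text embedding function, which does not benefit from more cores or additional memory once it has sufficient resources.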
Suspend indexing when freshness isn’t important: Suspend indexing (or increase target lag) when you don’t need changes in your documents to be immediately propagated to the search service (that is, when freshness isn’t as important during some period).
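Set target lag according to business requirements: Not every search application needs real-time indexing. A target lag that is too low can cause the index to refresh more often than necessary. For example, if your source data is updated every five minutes but consumers of the data query the search service only once an hour, set the target lag to one hour rather than five minutes.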
Define primary keys: Defining primary keys on your Cortex Search Service can result in significant reductions to both the cost and latency of indexing. Services with primary keys can make use of an optimized refresh path when the underlying data changes, particularly when the number of changes since the last refresh is small and the last refresh occurred within the previous week. For more information on defining primary keys, see Primary keys.
Bundle changes together: There is a fixed component to the cost of an update, so fewer, bigger updates are less expensive than more frequent, smaller updates. Likewise, any change to any value within a row triggers the search column in that row to be embedded again, even if the data within that search column is unchanged, so it is better to accumulate all the changes to a row into a single update. For more information about vector embedding costs, see Vector Embeddings.
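Minimize changes to the source data: Any change to the schema of the source query causes a full refresh of the service, including vector embedding and indexing. When creating a large service, consider adding extra payload columns for future use, so that adding a column later does not require a schema change that triggers a full refresh. The cost of extra columns is low.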
Tip
Materializing data in a table in the source query with a CREATE OR REPLACE command causes the service to fully refresh and embed all vectors again. It’s better to update the source table incrementally (for example, with MERGE INTO). For more information about vector embedding costs, see Vector Embeddings.
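Keep the source query as simple as possible: Joins or other complex operations can increase indexing cost (these are best applied during ETL or in another staging area). For more information about optimizing pipelines, see Best practices for dynamic tables.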
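To illustrate why incremental updates to the source table matter, the following sketch compares the embedding cost of a full refresh (for example, after a CREATE OR REPLACE or a schema change) with the cost of re-embedding only the changed rows. The row count, tokens per row, credit rate, and the 1% daily change fraction are illustrative assumptions, not published figures.

```python
def embedding_credits(rows: int, tokens_per_row: float, credits_per_million_tokens: float) -> float:
    """Credits needed to embed `rows` rows of roughly `tokens_per_row` tokens each."""
    return rows * tokens_per_row * credits_per_million_tokens / 1_000_000


# Illustrative assumptions (hypothetical workload, not Snowflake-published values).
TOTAL_ROWS = 10_000_000
TOKENS_PER_ROW = 500
CREDITS_PER_MILLION_TOKENS = 0.05
DAILY_CHANGE_FRACTION = 0.01  # assume 1% of rows are inserted or updated per day

# A full refresh embeds every row again.
full_refresh = embedding_credits(TOTAL_ROWS, TOKENS_PER_ROW, CREDITS_PER_MILLION_TOKENS)

# An incremental refresh embeds only the added or changed rows.
changed_rows = round(TOTAL_ROWS * DAILY_CHANGE_FRACTION)
incremental = embedding_credits(changed_rows, TOKENS_PER_ROW, CREDITS_PER_MILLION_TOKENS)

print(f"Full refresh:        {full_refresh:.1f} credits")
print(f"Incremental refresh: {incremental:.1f} credits")
```

Under these assumptions, a daily full rebuild costs about 100 times as much in embedding credits as a daily incremental refresh. The ratio is simply the inverse of the change fraction.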

Managing serving costs

You may find the following tips helpful for managing the serving costs of your Cortex Search Services:

Suspend serving when it isn’t in use: A running search service incurs costs even when it isn’t serving queries. For predictable idle periods (for example, during development), suspend the service manually. For services with intermittent traffic, set the AUTO_SUSPEND property so that serving is suspended after a period of inactivity and resumed when the next query arrives. For more information, see Auto-suspend serving on inactivity.

Observing costs

To learn more about the costs of your Cortex Search services, use the following Account Usage views.

CORTEX_SEARCH_DAILY_USAGE_HISTORY view contains daily totals for EMBED_TEXT tokens compute and serving credit compute usage per service. Snowflake intends to also provide virtual warehouse usage in this view in the future.
CORTEX_SEARCH_SERVING_USAGE_HISTORY view includes hourly serving credits per service.

Snowflake intends to provide this information in the Cortex Search administration interface in the future.

Estimating costs

EMBED_TEXT tokens compute

EMBED_TEXT tokens compute is charged per token of text in the search column, per document, at the credit rate of the selected embedding model. This compute cost is incurred for each row that is inserted or updated, including each row in the ON column during the initialization of the service and every insert or update thereafter. For information on the per-token cost of each embedding model, see Cortex Search Embedding Models.

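For example, if you create a service on a source query that contains 10 million rows of 500 tokens each, and the selected embedding model incurs 0.05 credits per 1 million tokens, you would pay the following for the initial refresh: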

(0.05 credits per 1 million tokens) * (10,000,000 rows) * (500 tokens per row) / (1,000,000 tokens)
= 250 credits

For each row inserted or updated thereafter, you incur 0.05 credits per 1 million tokens.
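The calculation above can be reproduced with a short script. This is a minimal sketch: the function name is hypothetical, and the rate of 0.05 credits per 1 million tokens is the example value used above, not necessarily the rate of your embedding model.

```python
TOKENS_PER_MILLION = 1_000_000


def embed_credits(rows: int, tokens_per_row: float, credits_per_million_tokens: float) -> float:
    """Estimate EMBED_TEXT credits for embedding `rows` rows."""
    total_tokens = rows * tokens_per_row
    return total_tokens / TOKENS_PER_MILLION * credits_per_million_tokens


# Initial refresh from the example: 10 million rows of 500 tokens each.
initial_refresh = embed_credits(rows=10_000_000, tokens_per_row=500, credits_per_million_tokens=0.05)
print(f"Initial refresh: {initial_refresh:.1f} credits")

# A later refresh that inserts or updates 20,000 rows (hypothetical batch size).
later_refresh = embed_credits(rows=20_000, tokens_per_row=500, credits_per_million_tokens=0.05)
print(f"Later refresh:   {later_refresh:.2f} credits")
```

Replace the example rate with the per-token rate of your selected model, and the tokens-per-row figure with a measured average from your own data.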

Tip

As an approximation, one token is equivalent to about 3/4 of an English word, or around 4 characters. To get an accurate estimate of tokens per row, use the COUNT_TOKENS function with a representative sample of your actual data.
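As a rough illustration of this rule of thumb, the following sketch converts word and character counts into approximate token counts. The helper names are hypothetical, and the ratios are only the approximations stated above; for an accurate figure, rely on COUNT_TOKENS rather than this heuristic.

```python
def tokens_from_words(word_count: int) -> float:
    """Approximate tokens, assuming one token is about 3/4 of an English word."""
    return word_count / 0.75


def tokens_from_chars(char_count: int) -> float:
    """Approximate tokens, assuming one token is about 4 characters."""
    return char_count / 4


# Hypothetical sample document; substitute rows from your own search column.
sample = "Cortex Search embeds each text row in the search column to enable semantic search."

by_words = tokens_from_words(len(sample.split()))
by_chars = tokens_from_chars(len(sample))
print(f"Estimate by words: {by_words:.0f} tokens")
print(f"Estimate by chars: {by_chars:.0f} tokens")
```

The two estimates usually differ somewhat. Treat them as a range for early planning, not as a billing figure.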

Serving compute

Serving compute is charged per gigabyte-month of indexed data, where indexed data is the user-provided data in the Cortex Search source query, plus vector embeddings computed on the user’s behalf. This is an ongoing cost that is incurred for as long as serving for the service is resumed. This cost is based on the number of rows indexed, the size of the total indexed data, and the dimensionality of the selected vector embedding model. For information on the dimensionality of each embedding model, see Cortex Search Embedding Models.

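For example, if your service contains 10 million rows, the selected embedding model has 768 dimensions, each row in the source query is approximately 1,000 bytes (including the search column), and the credit cost per GB per month of indexed data is 6.3, you would pay the following each month: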

(6.3 credits per GB) * (10,000,000 rows) * (768 dimensions * 4 bytes per dimension + 1,000 bytes per row) / (1,000,000,000 bytes per GB)
= 256.5 credits per month

Note

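The size of the data per row, regardless of whether a column is specified as a search column or an attribute column, varies by use case and increases with the amount of data (number of rows and columns) indexed by the service.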
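The serving estimate above can be expressed as a small function. This is a sketch under the example's assumptions: 4 bytes per vector dimension, decimal gigabytes, and an example rate of 6.3 credits per GB per month. The function name is hypothetical; check the Snowflake Service Consumption Table for the actual rate.

```python
BYTES_PER_GB = 1_000_000_000
BYTES_PER_DIMENSION = 4


def monthly_serving_credits(rows: int, dimensions: int, row_bytes: int,
                            credits_per_gb_month: float) -> float:
    """Estimate monthly serving credits from the uncompressed indexed data size."""
    # Each row stores its source data plus one embedding vector.
    bytes_per_row = dimensions * BYTES_PER_DIMENSION + row_bytes
    indexed_gb = rows * bytes_per_row / BYTES_PER_GB
    return indexed_gb * credits_per_gb_month


# Values from the example: 10 million rows, 768 dimensions, about 1,000 bytes per row.
monthly = monthly_serving_credits(rows=10_000_000, dimensions=768,
                                  row_bytes=1_000, credits_per_gb_month=6.3)
print(f"Estimated serving cost: {monthly:.1f} credits per month")
```

Because the per-row size varies by use case, it is worth measuring the average row size of your source query instead of assuming 1,000 bytes.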

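Multi-index Cortex Search

Multi-index search services typically store more data per row to accommodate the additional index columns. The total amount of data used depends on the size of the table and the number of indexes.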

To estimate the monthly serving cost for a multi-index service, use the following formula, where n is the number of vector index columns, d is the average number of vector dimensions, and r is the number of rows:

(6.3 credits per GB) * r * (n * d * (4 bytes per dimension) + 1,000 bytes per row) / (1,000,000,000 bytes per GB)

For example, if your service contains 10 million rows and has 2 vector indexes of 768 dimensions each, you can expect to pay the following each month:

(6.3 credits per GB) * (10,000,000 rows) * ((2 vector index columns) * (768 vector dimensions) * (4 bytes per dimension) + 1,000 bytes per row) / (1,000,000,000 bytes per GB)
= approximately 450.1 credits per month
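The multi-index formula can be sketched in the same way, with n, d, and r as parameters. The function name is hypothetical, and the 6.3 credits per GB per month rate and 1,000 bytes of source data per row are the example values used in this topic, not fixed constants.

```python
BYTES_PER_GB = 1_000_000_000
BYTES_PER_DIMENSION = 4


def multi_index_monthly_credits(r: int, n: int, d: float,
                                row_bytes: int = 1_000,
                                credits_per_gb_month: float = 6.3) -> float:
    """Monthly serving credits for r rows with n vector index columns of average dimension d."""
    bytes_per_row = n * d * BYTES_PER_DIMENSION + row_bytes
    return credits_per_gb_month * r * bytes_per_row / BYTES_PER_GB


# Example from the text: 10 million rows, 2 vector indexes of 768 dimensions each.
two_indexes = multi_index_monthly_credits(r=10_000_000, n=2, d=768)
print(f"Two vector indexes: {two_indexes:.1f} credits per month")

# With a single vector index, the formula reduces to the single-index estimate.
one_index = multi_index_monthly_credits(r=10_000_000, n=1, d=768)
print(f"One vector index:   {one_index:.1f} credits per month")
```

Evaluated literally with these inputs, the formula gives about 450.1 credits per month for two indexes and about 256.5 for one, so each additional 768-dimension index adds roughly 193.5 credits per month at this row count.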

Warehouse compute

The virtual warehouse compute cost for Cortex Search Services can vary based on the change rate of your data, target lag, and warehouse size. In general, Cortex Search Services with lower target lag values and higher change rates on underlying data will incur higher Warehouse-related compute costs.

Tip
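To get a clear understanding of the warehouse costs related to your Cortex Search pipeline, test Cortex Search with a dedicated warehouse so that you can isolate the virtual warehouse consumption incurred by Cortex Search refreshes. After you establish a cost baseline, you can move the Cortex Search Service to a shared warehouse.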

Storage

Cortex Search Services require storage to store the materialized results of the source query, as well as the search index. The size of the data stored can be estimated by materializing the source query into a table using the CORTEX_SEARCH_DATA_SCAN table function, and then examining the size of that table.

For detailed information about how this storage incurs cost, see Understanding storage cost.

Cloud services

Cortex Search Services use Cloud Services compute to trigger refreshes when an underlying base object has changed. These costs can vary based on the change rate of your data, target lag, and warehouse size. Cloud services costs for change tracking in Cortex Search tend to be lower for use cases with low change rates. Cloud services compute cost is subject to the constraint that Snowflake only bills if the daily cloud services cost is greater than 10% of the daily warehouse cost for the account.
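The 10% threshold can be sketched as follows. This assumes that only the portion of daily cloud services usage above 10% of daily warehouse usage is billed, which is how Snowflake's general cloud services adjustment is commonly described. The function name and credit figures are hypothetical, so confirm the exact billing rule against Snowflake's compute cost documentation.

```python
CLOUD_SERVICES_THRESHOLD = 0.10  # 10% of daily warehouse usage


def billed_cloud_services_credits(daily_cloud_services: float, daily_warehouse: float) -> float:
    """Cloud services credits billed for one day (assumption: only the excess over 10% is billed)."""
    threshold = CLOUD_SERVICES_THRESHOLD * daily_warehouse
    return max(0.0, daily_cloud_services - threshold)


# Hypothetical days for an account that uses 100 warehouse credits per day.
below_threshold = billed_cloud_services_credits(daily_cloud_services=5.0, daily_warehouse=100.0)
above_threshold = billed_cloud_services_credits(daily_cloud_services=15.0, daily_warehouse=100.0)
print(f"5 cloud services credits  -> billed {below_threshold:.1f}")
print(f"15 cloud services credits -> billed {above_threshold:.1f}")
```

The threshold applies to the account's total daily usage, so Cortex Search change tracking contributes to the comparison together with all other cloud services activity in the account.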