Cortex Search¶

Get started with Cortex Search

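Overview¶

Cortex Search enables low-latency, high-quality "fuzzy" search over your Snowflake data. Cortex Search powers a broad array of search experiences for Snowflake users, including retrieval-augmented generation (RAG) (https://en.wikipedia.org/wiki/Prompt_engineering#Retrieval-augmented_generation) applications that leverage large language models (LLMs).

Cortex Search gets you up and running with a hybrid (vector and keyword) search engine on your text data in minutes, without having to worry about embedding, infrastructure maintenance, search quality parameter tuning, or ongoing index refreshes. This means you can spend less time on infrastructure and search quality tuning, and more time developing high-quality chat and search experiences using your data. See the Cortex Search tutorials for step-by-step instructions on using Cortex Search to power AI chat and search applications.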

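When to use Cortex Search¶

The two primary use cases for Cortex Search are retrieval-augmented generation (RAG) and enterprise search.

RAG engine for LLM chatbots: Use Cortex Search as a RAG engine for chat applications over your text data, using semantic search to provide customized, contextualized responses.
Enterprise search: Use Cortex Search as a backend for a high-quality search bar embedded in your application.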

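Cortex Search for RAG¶

Retrieval-augmented generation (RAG) is a technique for retrieving data from a knowledge base to enhance the generated response of a large language model. The following architecture diagram shows how Cortex Search can be combined with Cortex LLM functions to create enterprise chatbots with RAG, using your Snowflake data as a knowledge base.

Cortex Search is the retrieval engine that provides the large language model with the context it needs to return answers grounded in your most up-to-date proprietary data.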

Example: Create and query a Cortex Search service¶

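This example walks you through creating a Cortex Search service and querying it using the REST API. For more details about querying the service, see the Query a Cortex Search service topic.

This example uses a sample customer support transcript dataset.

Run the following commands to set up the example database and schema.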

CREATE DATABASE IF NOT EXISTS cortex_search_db;

CREATE OR REPLACE WAREHOUSE cortex_search_wh WITH
   WAREHOUSE_SIZE='X-SMALL';

CREATE OR REPLACE SCHEMA cortex_search_db.services;

Run the following SQL commands to create the dataset.

CREATE OR REPLACE TABLE support_transcripts (
    transcript_text VARCHAR,
    region VARCHAR,
    agent_id VARCHAR
);

INSERT INTO support_transcripts VALUES
    ('My internet has been down since yesterday, can you help?', 'North America', 'AG1001'),
    ('I was overcharged for my last bill, need an explanation.', 'Europe', 'AG1002'),
    ('How do I reset my password? The email link is not working.', 'Asia', 'AG1003'),
    ('I received a faulty router, can I get it replaced?', 'North America', 'AG1004');

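Create the service¶

You can create a Cortex Search service with a single SQL query or from Snowflake AI & ML Studio. When you create a Cortex Search service, Snowflake transforms your source data to prepare it for low-latency serving. The following sections show how to create a service using SQL and using Snowflake AI & ML Studio in Snowsight.

Note

When you create a search service, the search index is built as part of the creation process. This means the CREATE CORTEX SEARCH SERVICE statement can take longer to complete for larger datasets.

Using SQL¶

The following example demonstrates how to use CREATE CORTEX SEARCH SERVICE to create a Cortex Search service on the sample customer support transcript dataset created in the previous section.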

CREATE OR REPLACE CORTEX SEARCH SERVICE transcript_search_service
  ON transcript_text
  ATTRIBUTES region
  WAREHOUSE = cortex_search_wh
  TARGET_LAG = '1 day'
  EMBEDDING_MODEL = 'snowflake-arctic-embed-l-v2.0'
  AS (
    SELECT
        transcript_text,
        region,
        agent_id
    FROM support_transcripts
);

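This command triggers the building of the search service for your data. In this example:

Queries to the service search for matches in the transcript_text column.

The TARGET_LAG parameter dictates that the Cortex Search service checks for updates to the base table support_transcripts approximately once per day.

The region and agent_id columns are indexed so that they can be returned along with the results of queries on the transcript_text column.

The region column is available as a filter column when querying the transcript_text column.

The warehouse cortex_search_wh is used to materialize the results of the specified query initially and each time the base table changes.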

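Note

Depending on the size of the warehouse specified in the query and the number of rows in your table, this CREATE command can take up to several hours to complete.
Snowflake recommends using a dedicated warehouse of size no larger than MEDIUM for each service.
Columns in the ATTRIBUTES field must be included in the source query, either via explicit enumeration or wildcard (*).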
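The rule that every ATTRIBUTES column must appear in the source query can be checked on the client before the statement is sent. The following is a minimal sketch, not part of any Snowflake library: the helper name build_create_service_sql and its parameters are hypothetical, and it assumes all identifiers come from trusted input because it does no quoting or escaping.

```python
def build_create_service_sql(name, search_column, attributes, select_columns,
                             source_table, warehouse, target_lag):
    """Build a CREATE CORTEX SEARCH SERVICE statement (illustrative helper only)."""
    # A "*" in the select list satisfies the inclusion rule by wildcard.
    if "*" not in select_columns:
        required = [search_column, *attributes]
        missing = [col for col in required if col not in select_columns]
        if missing:
            raise ValueError(f"columns missing from source query: {missing}")

    lines = [
        f"CREATE OR REPLACE CORTEX SEARCH SERVICE {name}",
        f"  ON {search_column}",
    ]
    if attributes:  # ATTRIBUTES is only emitted when filter columns are requested
        lines.append(f"  ATTRIBUTES {', '.join(attributes)}")
    lines += [
        f"  WAREHOUSE = {warehouse}",
        f"  TARGET_LAG = '{target_lag}'",
        "  AS (",
        f"    SELECT {', '.join(select_columns)}",
        f"    FROM {source_table}",
        ");",
    ]
    return "\n".join(lines)


sql = build_create_service_sql(
    name="transcript_search_service",
    search_column="transcript_text",
    attributes=["region"],
    select_columns=["transcript_text", "region", "agent_id"],
    source_table="support_transcripts",
    warehouse="cortex_search_wh",
    target_lag="1 day",
)
print(sql)
```

Calling the helper with an attribute that is absent from select_columns raises ValueError locally instead of failing when the statement runs.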

Using Snowsight¶

Follow these steps to create a Cortex Search service in Snowsight:

Sign in to Snowsight.
Choose a role that is granted the SNOWFLAKE.CORTEX_USER database role.
In the navigation menu, select AI & ML » Cortex Search.
Select Create.
Select a role and warehouse.

The role must be granted the SNOWFLAKE.CORTEX_USER database role. The warehouse is used for materializing the results of the source query when the service is created and refreshed.
Select the database and schema in which the service is defined.
Enter a name for your service, then select Next.
Select data to be indexed.
- To select a table or view, select Table or view.
  
  Select the table or view that contains the text data to be indexed for searching, then select Next. For example, select the support_transcripts table.
- To select files from a stage, select Stage. (Preview)
  
  Select the stage that contains the files to be indexed for searching, then select Next.
Note

If you want to specify multiple data sources or perform transformations when defining the service, use SQL.
If you selected Table or view:
- Select the columns you want included in the search results, for example, transcript_text, region, and agent_id, then select Next.
- Select the column that will be searched, for example, transcript_text, then select Next.
- If you want to be able to filter your search results based on particular columns, select those columns, then select Next. If you don't need any filters, select Skip this option.
If you selected Stage (Preview):
- Select the destination for your processed data, then select Next.
Select the configuration parameters for the service.

Set your target lag, which is the amount of time your service content should lag behind updates to the base data, then select Create.

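The final step confirms that your service has been created and displays the service name and its data source.

Note

When you create the service from Snowsight, the name of the service is double-quoted. For details on what that means when you reference the service in SQL, see Double-quoted identifiers.

Grant usage permissions¶

After the service and index are created, you can grant usage on the service, its database, and its schema to other roles, such as customer_support.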

GRANT USAGE ON DATABASE cortex_search_db TO ROLE customer_support;
GRANT USAGE ON SCHEMA services TO ROLE customer_support;

GRANT USAGE ON CORTEX SEARCH SERVICE transcript_search_service TO ROLE customer_support;

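Preview the service¶

To validate that the service is populated with data correctly, you can preview the service using the SEARCH_PREVIEW function from a SQL environment: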

SELECT PARSE_JSON(
  SNOWFLAKE.CORTEX.SEARCH_PREVIEW(
      'cortex_search_db.services.transcript_search_service',
      '{
        "query": "internet issues",
        "columns":[
            "transcript_text",
            "region"
        ],
        "filter": {"@eq": {"region": "North America"} },
        "limit":1
      }'
  )
)['results'] as results;

Sample successful query response:

[
  {
  "transcript_text" : "My internet has been down since yesterday, can you help?",
  "region" : "North America"
  }
]

This response confirms that the service is populated with data and serves reasonable results for the given query.
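When the preview query is generated from application code, the JSON argument to SEARCH_PREVIEW is easier to assemble from a dictionary than by hand. The following is a minimal sketch under stated assumptions: the helper name search_preview_sql is hypothetical, the service name is assumed to be trusted input, and the payload is escaped for a single-quoted SQL string literal by doubling backslashes and single quotes. Where your driver supports bind parameters, passing the payload as a bound value is the safer choice.

```python
import json


def search_preview_sql(service_name, query, columns, filter_obj=None, limit=10):
    """Return a SELECT statement that calls SNOWFLAKE.CORTEX.SEARCH_PREVIEW."""
    payload = {"query": query, "columns": columns, "limit": limit}
    if filter_obj is not None:
        payload["filter"] = filter_obj

    # Escape for a single-quoted SQL literal: backslashes first, then quotes.
    literal = json.dumps(payload).replace("\\", "\\\\").replace("'", "''")
    return (
        "SELECT PARSE_JSON(\n"
        "  SNOWFLAKE.CORTEX.SEARCH_PREVIEW(\n"
        f"    '{service_name}',\n"
        f"    '{literal}'\n"
        "  )\n"
        ")['results'] AS results;"
    )


preview_sql = search_preview_sql(
    "cortex_search_db.services.transcript_search_service",
    query="internet issues",
    columns=["transcript_text", "region"],
    filter_obj={"@eq": {"region": "North America"}},
    limit=1,
)
print(preview_sql)
```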

You can also inspect the contents of the service using the CORTEX_SEARCH_DATA_SCAN table function.

SELECT
  *
FROM
  TABLE (
    CORTEX_SEARCH_DATA_SCAN (
      SERVICE_NAME => 'transcript_search_service'
    )
  );

+ ---------------------------------------------------------- + --------------- + -------- + ------------------------------ +
|                      transcript_text                       |     region      | agent_id | _GENERATED_EMBEDDINGS_MY_MODEL |
| ---------------------------------------------------------- | --------------- | -------- | ------------------------------ |
| 'My internet has been down since yesterday, can you help?' | 'North America' | 'AG1001' | [0.1, 0.2, 0.3, 0.4]           |
| 'I was overcharged for my last bill, need an explanation.' | 'Europe'        | 'AG1002' | [0.1, 0.2, 0.3, 0.4]           |
+ ---------------------------------------------------------- + --------------- + -------- + ------------------------------ +

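Query the service from an application¶

After you have created the search service, granted usage on it to your role, and previewed it, you can query it from your application using the Python API.

The following code uses the Python API to retrieve the support ticket most relevant to the query internet issues, filtered to results in the North America region: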

from snowflake.core import Root
from snowflake.snowpark import Session

# Placeholder: replace with a dict of your own connection parameters.
CONNECTION_PARAMETERS = {"..."}

session = Session.builder.configs(CONNECTION_PARAMETERS).create()
root = Root(session)

transcript_search_service = (root
  .databases["cortex_search_db"]
  .schemas["services"]
  .cortex_search_services["transcript_search_service"]
)

resp = transcript_search_service.search(
  query="internet issues",
  columns=["transcript_text", "region"],
  filter={"@eq": {"region": "North America"} },
  limit=1
)
print(resp.to_json())

Sample successful query response:

{
  "results": [
    {
      "transcript_text": "My internet has been down since yesterday, can you help?",
      "region": "North America"
    }
  ],
  "request_id": "5d8eaa5a-800c-493c-a561-134c712945ba"
}

Cortex Search services return all of the columns specified in the columns field of the query.

Required privileges¶

To create a Cortex Search Service, your role must have the required privileges to use the Cortex embedding functions, which requires granting the SNOWFLAKE.CORTEX_USER database role or the SNOWFLAKE.CORTEX_EMBED_USER database role to the service creator role. You must also have the following privileges:
- The CREATE CORTEX SEARCH SERVICE or OWNERSHIP privilege on the schema where you create the service.
- The SELECT privilege on the underlying table(s) or view(s) that the service queries.
- The USAGE privilege on the warehouse that refreshes the service.
Change tracking must be enabled on all underlying objects used by a Cortex Search Service. For more information about change tracking requirements, see Change Tracking Requirements.
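To query a Cortex Search service, the role of the querying user must have USAGE privileges on the service itself and on the database and schema in which the service resides. See Cortex Search access control requirements.
To suspend or resume a Cortex Search service using the ALTER command, the role of the querying user must have the OPERATE privilege on the service. See ALTER CORTEX SEARCH SERVICE.

Important

Cortex Search services perform searches with owner's rights and follow the same security model as other Snowflake objects that run with owner's rights. For more information, see Cortex Search access control requirements.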

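Understanding Cortex Search quality¶

Cortex Search uses an ensemble of retrieval and ranking models to provide a high level of search quality with little to no tuning required. Under the hood, Cortex Search takes a "hybrid" approach to retrieving and ranking documents. Each search query uses:

Vector search, for retrieving semantically similar documents.
Keyword search, for retrieving lexically similar documents.
Semantic reranking, for reranking the most relevant documents in the result set.

This hybrid retrieval approach, coupled with a semantic reranking step, achieves high search quality across a broad range of datasets and queries.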
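To give a feel for what combining a vector ranking with a keyword ranking can look like, the following toy sketch merges two ranked lists with reciprocal rank fusion (RRF). This is a generic, well-known fusion technique swapped in purely for illustration. This page does not describe how Cortex Search merges or reranks results internally, and the document IDs and the constant k=60 are assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked well by more than one retriever rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["a", "b", "c"]   # hypothetical semantic-search ranking
keyword_hits = ["b", "d", "a"]  # hypothetical keyword-search ranking
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

In this toy input, documents "a" and "b" appear in both lists and therefore outrank "c" and "d", which each appear in only one.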

You can customize the scoring of search results by applying numeric boosts, time decays, adjusting component weights, or disabling reranking. For more information, see Customizing Cortex Search scoring.

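Cortex Search embedding models¶

Cortex Search lets you select a hosted embedding model to use in the vector search stage of retrieval. The following embedding models are available in Cortex Search.

Important

Model pricing varies. The authoritative model prices are listed in the Snowflake Service Consumption Table. If a price shown below differs from the model price in the Snowflake Service Consumption Table, the Snowflake Service Consumption Table takes precedence.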


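Model name	Output dimensions	Context window (tokens)	Language support	Description
`snowflake-arctic-embed-m-v1.5` (default)	768	512	English-only	Snowflake's most practical, English-only embedding model. This open-source, 110M-parameter model yields the fastest indexing times of the models available for Cortex Search. For more information, see the Arctic Embed 1.5 blog post and Arctic Embed 1.5 model card.
`snowflake-arctic-embed-l-v2.0`	1024	512	Multilingual	Snowflake's price-performant multilingual embedding model with a context window of 512 tokens. This open-source, 568M-parameter model yields high quality on both English and non-English datasets. For more information, see the Arctic Embed 2 blog post and Arctic Embed 2 model card.
`snowflake-arctic-embed-l-v2.0-8k`	1024	8192	Multilingual	Snowflake's price-performant multilingual embedding model with an extended context window of 8192 tokens. This open-source, 568M-parameter model yields high quality on both English and non-English datasets.

Each model has different performance, cost, context window size, and quality characteristics. Review the model specifications carefully to determine the best model for your workload. See the Snowflake Service Consumption Table for the most accurate credit cost per million tokens for each model.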

Tokens, model context windows, and text splitting¶

A token is a sequence of characters and is the smallest unit of text that can be processed by a large language model. As an approximation, one token is equivalent to about 3/4 of an English word, or around 4 characters. To calculate the number of tokens in a string, use the COUNT_TOKENS Cortex Function. For example, calculating the tokens for a string to be embedded with the snowflake-arctic-embed-m-v1.5 model:

SELECT SNOWFLAKE.CORTEX.COUNT_TOKENS('snowflake-arctic-embed-m', '<input_text>') as token_count

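Each vector embedding model has a fixed-size context window for text input, listed in the embedding model table above. During indexing and serving, if a value in the search column has more tokens than the context window, Cortex Search truncates the string to the size of the context window before embedding it into vector space for semantic search. For keyword-based retrieval, however, Cortex Search uses the full body of the text.

Snowflake provides built-in functions for splitting text into smaller chunks. For more information, see SPLIT_TEXT_RECURSIVE_CHARACTER.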

For best search results with Cortex Search, Snowflake recommends splitting the text in your search column into chunks of no more than 512 tokens (about 385 English words). While there are longer-context embedding models available today, such as snowflake-arctic-embed-l-v2.0-8k, research shows that a smaller chunk size typically results in higher retrieval and downstream LLM response quality. With smaller chunks, retrieval can be more precise for a given query and, in a retrieval-augmented generation (RAG) scenario, the downstream LLM receives text chunks that are more relevant to the query.
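To illustrate the chunk-size recommendation, the following rough sketch splits text on whitespace into chunks that stay within a token budget, using the approximation of about 4 characters per token mentioned above. It is a client-side illustration only: the function names are hypothetical, the estimate is a heuristic rather than a real tokenizer, and in Snowflake itself COUNT_TOKENS and SPLIT_TEXT_RECURSIVE_CHARACTER are the functions to use.

```python
def estimate_tokens(text):
    """Heuristic token estimate: about 4 characters per token, rounded up."""
    return -(-len(text) // 4)  # ceiling division


def chunk_text(text, max_tokens=512):
    """Greedily pack whitespace-separated words into chunks within max_tokens.

    A single word longer than the budget still becomes its own chunk.
    """
    chunks, current, current_tokens = [], [], 0
    for word in text.split():
        word_tokens = estimate_tokens(word + " ")  # count the separating space
        if current and current_tokens + word_tokens > max_tokens:
            chunks.append(" ".join(current))
            current, current_tokens = [], 0
        current.append(word)
        current_tokens += word_tokens
    if current:
        chunks.append(" ".join(current))
    return chunks


document = "word " * 2000  # stand-in for a long transcript
chunks = chunk_text(document, max_tokens=512)
print(len(chunks), max(estimate_tokens(chunk) for chunk in chunks))
```

Because each word's estimate is rounded up separately, the packing is conservative: every chunk's estimated token count stays at or below the budget.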

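Refreshes¶

The content served in a Cortex Search service is based on the results of a specific query. When the data underlying a Cortex Search service changes, the service updates to reflect those changes. These updates are referred to as refreshes. This process is automated and involves analyzing the query that underlies the service.

Cortex Search services have the same refresh properties as dynamic tables. See the Understanding dynamic table initialization and refresh topic to understand the refresh characteristics of a Cortex Search service.

The source query for a Cortex Search service must be a candidate for dynamic table incremental refresh. For details on those requirements, see Support for incremental refresh. This restriction is designed to prevent unwanted runaway costs associated with vector embedding computation. For details about the constructs that aren't supported for dynamic table incremental refresh, see Supported queries in dynamic tables.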

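Primary keys¶

The primary key of a Cortex Search service is an optional set of columns that uniquely identifies each row in the source query (that is, no two rows share the same combination of values in the specified columns). To be used with Cortex Search services, primary key columns must be of the TEXT data type.

You can specify the primary key when you create the service, as follows: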

CREATE OR REPLACE CORTEX SEARCH SERVICE transcript_search_service
  ON transcript_text
  PRIMARY KEY (region, agent_id)
  WAREHOUSE = cortex_search_wh
  TARGET_LAG = '1 day'
  AS (
    SELECT
        transcript_text, region, agent_id
    FROM support_transcripts
);

The primary key columns of an existing service can be changed with ALTER CORTEX SEARCH SERVICE ... SET PRIMARY KEY (...). For syntax details, see ALTER CORTEX SEARCH SERVICE.

Services with primary keys can make use of an optimized refresh path when data underlying the service changes. This optimized path can result in significant reductions to the cost and latency of a refresh. With this optimization enabled, the search service periodically compacts index information generated during a refresh. You can specify a target frequency for index refreshes by setting the FULL_INDEX_BUILD_INTERVAL_DAYS property on the service. For syntax details, see CREATE CORTEX SEARCH SERVICE and ALTER CORTEX SEARCH SERVICE.

Note

FULL_INDEX_BUILD_INTERVAL_DAYS is a soft target. Full rebuilds may occur more frequently than the specified interval to optimize serving performance based on factors such as service target lag, change rate in the service source data, and overall service size.

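Queries to services with primary keys can also use the @primarykey filter operator.

Important

The combination of primary key column values must be unique for each row in the source query. Duplicates are ignored in the resulting search index.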
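Because duplicate key combinations are ignored in the index, it can be worth checking a sample of source rows for duplicates before choosing the primary key columns. The following is a minimal sketch that works on rows already fetched into Python dictionaries. The helper name find_duplicate_keys and the fifth sample row are hypothetical.

```python
from collections import Counter


def find_duplicate_keys(rows, key_columns):
    """Return the primary key value combinations that occur more than once."""
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return [key for key, count in counts.items() if count > 1]


rows = [
    {"region": "North America", "agent_id": "AG1001"},
    {"region": "Europe", "agent_id": "AG1002"},
    {"region": "Asia", "agent_id": "AG1003"},
    {"region": "North America", "agent_id": "AG1004"},
    {"region": "Europe", "agent_id": "AG1002"},  # hypothetical duplicate row
]
print(find_duplicate_keys(rows, ["region", "agent_id"]))
```

With the sample data, (region, agent_id) is unique only once the duplicate row is removed, while region alone repeats for both "North America" and "Europe" and so would not work as a key.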

Multi-index Cortex Search¶

Cortex Search can index multiple columns or use custom vector embeddings for queries, allowing you additional flexibility in how your Cortex Search Service interprets data and responds to user requests. You should use Multi-index Cortex Search when you have a use case that features one or more of:

Multiple search fields: Users need to search across different fields of a record.
User-provided vector embeddings: You have pre-computed vector embeddings for one or more columns prior to ingestion into the Cortex Search Service.
Mixed search types: You want to support searching different fields with preference to a type of search.
- Use text indexes for fields where exact or fuzzy keyword matches are important. Some examples are product codes, names, and categories.
- Use vector indexes for fields with longer text content where semantic understanding is valuable. Examples include product descriptions, user reviews, and support cases.
Field-specific relevance: Different fields of your data should contribute differently to relevance of a search result.

For example, for a product catalog search use case, you can create a multi-index service where:

Product names and SKUs are text indexes for precise lexical matching.
Product descriptions are vector indexes for semantic matching.
Category and brand names are both text and vector indexes to support both lexical and semantic matches.

For examples of creating a multi-index Cortex Search service, see CREATE CORTEX SEARCH SERVICE ... TEXT INDEXES .. VECTOR INDEXES. For examples of querying a multi-index service, see Query a Cortex Search service - Multi-index queries.

User-provided vector embeddings¶

Multi-index Cortex Search allows you to use pre-computed vector embeddings from any embedding model (including open-source, commercial, and custom-trained models). Use user-provided vector embeddings when:

You want to use an embedding model not natively available in Cortex Search, or you want to reuse embeddings you have already generated to reduce cost and improve performance.
You want to combine your vector embeddings with Cortex Search text indexes for hybrid retrieval.

When you specify a bare column name in the VECTOR INDEXES clause, but do not specify a model, Cortex Search treats the contents of the column as user-provided vector embeddings. User-provided vectors are indexed as-is and do not incur any embedding cost.

Note

You cannot load vectors directly into a Snowflake table. Instead, cast an array of numbers to the VECTOR data type when inserting or updating data in the source table for your Cortex Search Service. See Vector conversion for details and examples of how to do this.

Cortex Search chooses one of the following modes at search time, depending on whether you provide a query vector or query text in your search request:


Mode	Index time	Query time
Fully user-managed	Provide vectors in a VECTOR column	Provide a query vector via multi_index_query
User-managed with managed query embeddings	Provide vectors in a VECTOR column	Cortex Search embeds query text using the specified model

Suspending indexing and serving¶

Much like Dynamic Tables, Cortex Search Services automatically suspend their indexing state when they encounter five consecutive refresh failures related to the source query. If you encounter this failure for your service, you can view the specific SQL error using either DESCRIBE CORTEX SEARCH SERVICE or the CORTEX_SEARCH_SERVICES view. The output from both includes the following columns:

The INDEXING_STATE column, which is SUSPENDED for a suspended service.
The INDEXING_ERROR column, which contains the specific SQL error encountered in the source query.

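After resolving the underlying issue, you can resume the service with ALTER CORTEX SEARCH SERVICE <name> RESUME INDEXING. For syntax details, see ALTER CORTEX SEARCH SERVICE.

Cost considerations¶

A Cortex Search service incurs the following costs:


Category	Description
Virtual warehouse compute	A Cortex Search service requires a virtual warehouse to refresh the service: to run queries against the base objects when they are initialized and refreshed, including orchestrating text embedding jobs and building the search index. These operations use compute resources, which consume credits. If no changes are identified during a refresh, virtual warehouse credits aren't consumed, because there is no new data to refresh.
EMBED_TEXT token compute	A Cortex Search service automatically embeds each text row in the search column specified by the `ON` parameter into vector space to enable semantic search, which incurs a credit cost per embedded token. This involves calling EMBED_TEXT_768 or EMBED_TEXT_1024 to convert each document into a series of numbers that encodes its meaning. Embeddings are computed each time a row is inserted or updated. Embeddings are processed incrementally in the evaluation of the source query, so the embedding cost is only incurred for added or changed documents. For more information about vector embedding costs, see Vector embeddings.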
Multi-index Cortex Search	Multi-index Cortex Search Services have costs dependent on how you embed tokens and the number of columns you index. Larger embedding vectors or higher numbers of index columns incur higher costs. Embeddings are computed each time a row is inserted or updated. Embeddings are processed incrementally in the evaluation of the source query, so the embedding cost is only incurred for added or changed documents.
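Serving compute	A Cortex Search service uses multi-tenant serving compute, separate from a user-provided virtual warehouse, to provide low-latency, high-throughput search. The compute cost for this component is charged per gigabyte per month (GB/mo) of uncompressed indexed data, where indexed data is the user-provided data in the Cortex Search source query plus the vector embeddings computed on the user's behalf. You incur these costs while the service is available to respond to queries, even if no queries are served during a given period. For the Cortex Search serving credit rate per GB/mo of indexed data, see the Snowflake Service Consumption Table.
Storage	A Cortex Search service materializes the source query into a table stored in your account. This table is transformed into data structures optimized for low-latency serving, which are also stored in your account. Storage for the table and the intermediate data structures is billed at a flat rate per terabyte (TB).
Cloud services compute	A Cortex Search service uses cloud services compute to identify changes in the underlying base objects and to determine whether the virtual warehouse needs to be invoked. Cloud services compute cost is subject to the constraint that Snowflake only bills it if the daily cloud services cost is greater than 10% of the daily warehouse cost for the account.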

For best practices on managing the cost of Cortex Search services, see Understanding cost for Cortex Search services.
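The 10% rule for cloud services compute in the table above can be made concrete with a little arithmetic. The following sketch assumes that only the portion of daily cloud services usage above 10% of daily warehouse usage is billed. That is an assumption not stated in the table, the function name and the credit figures are hypothetical, and the Snowflake Service Consumption Table remains the authority on billing.

```python
def billable_cloud_services_credits(cloud_services_credits, warehouse_credits):
    """Estimate the billed daily cloud services credits under the 10% rule.

    Assumption: usage up to 10% of the day's warehouse credits is not billed,
    and only the excess above that threshold is charged.
    """
    threshold = 0.10 * warehouse_credits
    return max(0.0, cloud_services_credits - threshold)


# 20 warehouse credits in a day gives a threshold of 2 cloud services credits.
print(billable_cloud_services_credits(1.0, 20.0))  # below the threshold
print(billable_cloud_services_credits(3.0, 20.0))  # above the threshold
```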

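To view AI services usage costs for each Cortex Search service in your account, aggregated daily, see the CORTEX_SEARCH_DAILY_USAGE_HISTORY view.

Known limitations¶

Usage of Cortex Search is subject to the following limitations:

Base table size: The materialized query result in a search service must be smaller than 100 million rows to maintain optimal serving performance. If the materialized result of your query exceeds 100 million rows, the create query results in an error.

Note

To raise the row scaling limit for a Cortex Search service above 100 million rows, contact your Snowflake account team.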
Throughput and rate limiting: Cortex Search returns a 429 HTTP status code if a client sends requests too quickly or if the service becomes overloaded. Client logic calling the search service should implement backoff and retry logic to handle these 429 responses gracefully.

Note

To increase throughput beyond 20 QPS for a single search service or 140 QPS across all services in your account, contact your Snowflake account team.
Query constructs: Cortex Search service source queries must adhere to the same query restrictions as dynamic tables. For more details, see Dynamic table limitations.
Data retention: Cortex Search Services have the same requirements as dynamic tables around data retention. Specifically, you can't set the DATA_RETENTION_TIME_IN_DAYS object parameter in your base tables to zero or set this parameter on the schema or database containing the search service. Additionally, search services can become stale if they are not refreshed within MAX_DATA_EXTENSION_TIME_IN_DAYS. Once stale, they must be recreated to resume refreshes. See Dynamic table limitations for more detail.
Cloning: Cortex Search Services do not currently support cloning. Snowflake intends to provide this capability in some future release, but cannot guarantee a specific timeline.
Table immutability: While running, your Cortex Search Services require tables they access aren't modified or dropped. To safely update tables used by a Cortex Search Service, stop the service before making your changes.
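The backoff-and-retry recommendation in the throughput limitation above can be sketched as exponential backoff with jitter. This is a generic pattern rather than a Snowflake-provided client feature. RateLimitedError is a hypothetical exception standing in for however your HTTP or Python client surfaces a 429 response, and send_request is any zero-argument callable that performs the search.

```python
import random
import time


class RateLimitedError(Exception):
    """Hypothetical stand-in for an HTTP 429 response from the search service."""


def call_with_backoff(send_request, max_attempts=5, base_delay=0.5,
                      max_delay=8.0, sleep=time.sleep):
    """Call send_request, retrying on RateLimitedError with jittered backoff."""
    for attempt in range(max_attempts):
        try:
            return send_request()
        except RateLimitedError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # The delay cap doubles each attempt; a random wait below the cap
            # ("full jitter") keeps many clients from retrying in lockstep.
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))


# Usage with a fake request that is rate limited twice before succeeding.
attempts = {"count": 0}


def flaky_search():
    attempts["count"] += 1
    if attempts["count"] <= 2:
        raise RateLimitedError("429 Too Many Requests")
    return {"results": []}


waits = []
response = call_with_backoff(flaky_search, sleep=waits.append)
print(response, len(waits))
```

Passing sleep as a parameter keeps the sketch testable without real waiting; in an application you would leave the default time.sleep in place.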

Regional availability¶

Cortex Search is available in the People's Republic of China.

Legal notices¶

The Data Classification of inputs and outputs is set forth in the following table.


Input Data Classification	Output Data Classification
Usage Data	Customer Data

For more information, see Snowflake AI and ML.