Snowflake Cortex AI Functions (including LLM functions)

Use Cortex AI Functions in Snowflake to run unstructured analytics on text and images with industry-leading LLMs from OpenAI, Anthropic, Meta, Mistral AI, and DeepSeek. AI Functions support use cases such as:

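  • Extract entities to enrich metadata and streamline validation
  • Aggregate insights across customer tickets
  • Filter and classify content using natural language
  • Analyze sentiment and aspects to improve service
  • Translate and localize multilingual content
  • Parse documents for analytics and RAG pipelines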

All the LLMs that Snowflake provides access to via our Snowflake AI Features are deployed within the Snowflake Service perimeter.

Available functions

Snowflake Cortex features are provided as SQL functions and are also available in Python. Cortex AI Functions can be grouped into the following categories:

Cortex AI functions

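Task-specific functions are purpose-built, managed functions that automate routine tasks, such as simple summaries and quick translations, and don’t require any customization.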

  • AI_COMPLETE: Generates a completion for a given text string or image using a selected LLM. Use this function for most generative AI tasks.

  • AI_CLASSIFY: Classifies text or images into user-defined categories.

  • AI_FILTER: Returns True or False for a given text or image input, allowing you to filter results in SELECT, WHERE, or JOIN … ON clauses.

  • AI_AGG: Aggregates a text column and returns insights across multiple rows based on a user-defined prompt. This function isn’t subject to context window limitations.

  • AI_EMBED: Generates an embedding vector for a text or image input, which can be used for similarity search, clustering, and classification tasks.

  • AI_EXTRACT: Extracts information from an input string or file, for example, text, images, and documents. Supports multiple languages.

  • AI_SENTIMENT: Extracts sentiment from text.

  • AI_SUMMARIZE_AGG: Aggregates a text column and returns a summary across multiple rows. This function isn’t subject to context window limitations.

  • AI_SIMILARITY: Calculates the embedding similarity between two inputs.

  • AI_TRANSCRIBE: Transcribes audio and video files stored in a stage, extracting text, timestamps, and speaker information.

  • AI_PARSE_DOCUMENT: Extracts text (using OCR mode) or text with layout information (using LAYOUT mode) from documents in an internal or external stage. Can also extract images found in a document.

  • AI_REDACT: Redacts personally identifiable information (PII) from text.

  • AI_TRANSLATE: Translates text between supported languages.

  • SUMMARIZE (SNOWFLAKE.CORTEX): Returns a summary of the text that you’ve specified.

Helper functions

Helper functions are purpose-built managed functions that reduce cases of failures when running other Cortex AI Functions, for example by getting the count of tokens in an input prompt to ensure the call doesn’t exceed a model limit.

  • TO_FILE: Creates a reference to a file in an internal or external stage for use with AI_COMPLETE and other functions that accept files.

  • AI_COUNT_TOKENS: Given an input text, returns the token count based on the model or Cortex function specified.

  • PROMPT: Helps you build prompt objects for use with AI_COMPLETE and other functions.

  • TRY_COMPLETE (SNOWFLAKE.CORTEX): Works like the COMPLETE function, but returns NULL when the function could not execute instead of an error code.

Cortex Guard

Cortex Guard is an option of the AI_COMPLETE (or SNOWFLAKE.CORTEX.COMPLETE) function designed to filter possible unsafe and harmful responses from a language model. Cortex Guard is currently built with Meta’s Llama Guard 3. Cortex Guard works by evaluating the responses of a language model before that output is returned to the application. Once you activate Cortex Guard, language model responses which may be associated with violent crimes, hate, sexual content, self-harm, and more are automatically filtered. See COMPLETE arguments for syntax and examples.

Note

Usage of Cortex Guard incurs compute charges based on the number of input tokens processed, in addition to the charges for the AI_COMPLETE function.

Performance considerations

Cortex AI Functions are optimized for throughput. We recommend using these functions to process large numbers of inputs, such as text from large SQL tables; batch processing is typically the best fit for AI Functions. For more interactive use cases where latency is important, use the REST APIs, which are available for simple inference (Complete API), embedding (Embed API), and agentic applications (Agents API).

Cortex LLM privileges

This section describes the privileges required for users to access Snowflake Cortex AI Functions. It covers how to control and grant access to these functions using roles and account-level privileges.

USE AI FUNCTIONS account-level privilege

Important

Your users need both the USE AI FUNCTIONS account-level privilege and one of the CORTEX_USER or AI_FUNCTIONS_USER database roles to use Snowflake Cortex AI Functions. Because USE AI FUNCTIONS is granted to the PUBLIC role by default, no additional action is needed for this privilege unless it has been revoked.

The USE AI FUNCTIONS account-level privilege includes the privileges that allow your users to call Snowflake Cortex AI functions. By default, the USE AI FUNCTIONS privilege is granted to the PUBLIC role. The PUBLIC role is automatically granted to all users and roles, allowing all users in your account to use the Snowflake Cortex AI functions. If you don’t want all your users to have this privilege, you can revoke access from the PUBLIC role and grant access to other roles.

This section explains how to do the following:

  • Revoke the USE AI FUNCTIONS privilege from the PUBLIC role
  • Grant the USE AI FUNCTIONS privilege to specific roles

Important

You must use the ACCOUNTADMIN role to manage the USE AI FUNCTIONS account-level privilege.

To revoke the USE AI FUNCTIONS account-level privilege from the PUBLIC role, run the following command:

REVOKE USE AI FUNCTIONS ON ACCOUNT
FROM ROLE PUBLIC;

Note

Revoking the USE AI FUNCTIONS account-level privilege prevents your users from accessing Snowflake Cortex AI Functions. Your users need both the USE AI FUNCTIONS account-level privilege and one of the CORTEX_USER or AI_FUNCTIONS_USER database roles to use Snowflake Cortex AI Functions.

After you’ve revoked the USE AI FUNCTIONS privilege from the PUBLIC role, you can use the ACCOUNTADMIN role to grant it to other roles in your Snowflake account.

The following example:

  1. Grants the USE AI FUNCTIONS privilege to cortex_user_role.
  2. Grants the cortex_user_role to example_user.

USE ROLE ACCOUNTADMIN;

CREATE ROLE cortex_user_role;

GRANT USE AI FUNCTIONS ON ACCOUNT TO ROLE cortex_user_role;

GRANT ROLE cortex_user_role TO USER example_user;

You can grant access to Snowflake Cortex AI Functions through roles that are commonly used by specific groups of users. For example, if you’ve created an analyst role that is used as a default role by analysts in your organization, you can grant these users access to Snowflake Cortex AI Functions with a single GRANT <privileges> … TO ROLE statement. For more information about granting privileges to commonly used roles, see User roles.

GRANT USE AI FUNCTIONS ON ACCOUNT TO ROLE analyst;

Important

Currently, USE AI FUNCTIONS does not apply to AI Function queries that are run inside Snowflake native applications. A query with AI Function calls runs successfully regardless of whether the role has the USE AI FUNCTIONS privilege.

Using AI Functions with Restricted Caller’s Rights

To use AI Functions with Restricted Caller’s Rights, you must grant the USE AI FUNCTIONS privilege to both the session role and the service or application owner role.

For example, to use AI Functions inside a Snowpark Container Services (SPCS) service that runs with Restricted Caller’s Rights:

  1. Grant the USE AI FUNCTIONS privilege to the role used in the SPCS session (for example, CHATBOT_USER_ROLE):

    GRANT USE AI FUNCTIONS ON ACCOUNT TO ROLE CHATBOT_USER_ROLE;
  2. Grant the caller version of the privilege to the service owner role:

    GRANT CALLER USE AI FUNCTIONS ON ACCOUNT TO ROLE <service_owner_role>;

CORTEX_USER database role

The CORTEX_USER database role in the SNOWFLAKE database includes the privileges that allow users to call Snowflake Cortex AI Functions. By default, the CORTEX_USER role is granted to the PUBLIC role. The PUBLIC role is automatically granted to all users and roles, so this allows all users in your account to use the Snowflake Cortex AI functions.

If you don’t want all users to have this privilege, you can revoke access from the PUBLIC role and grant access to other roles. The SNOWFLAKE.CORTEX_USER database role cannot be granted directly to a user. For more information, see Using SNOWFLAKE database roles.

To revoke the CORTEX_USER database role from the PUBLIC role, run the following commands using the ACCOUNTADMIN role:

REVOKE DATABASE ROLE SNOWFLAKE.CORTEX_USER
  FROM ROLE PUBLIC;

REVOKE IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE
  FROM ROLE PUBLIC;

You can then selectively provide access to specific roles. A user with the ACCOUNTADMIN role can grant this role to a custom role in order to allow users to access Cortex AI functions. In the following example, use the ACCOUNTADMIN role and grant the user some_user the CORTEX_USER database role via the account role cortex_user_role, which you create for this purpose.

USE ROLE ACCOUNTADMIN;

CREATE ROLE cortex_user_role;
GRANT DATABASE ROLE SNOWFLAKE.CORTEX_USER TO ROLE cortex_user_role;

GRANT ROLE cortex_user_role TO USER some_user;

You can also grant access to Snowflake Cortex AI functions through existing roles commonly used by specific groups of users. (See User roles.) For example, if you have created an analyst role that is used as a default role by analysts in your organization, you can easily grant these users access to Snowflake Cortex AI Functions with a single GRANT statement.

GRANT DATABASE ROLE SNOWFLAKE.CORTEX_USER TO ROLE analyst;

AI_FUNCTIONS_USER database role

The AI_FUNCTIONS_USER database role in the SNOWFLAKE database allows users to call Snowflake Cortex scalar AI functions (all Cortex AI functions except the aggregate functions AI_AGG and AI_SUMMARIZE_AGG) without granting access to Cortex services such as Cortex Agent, Cortex Analyst, Cortex Fine-tuning, or Cortex Search.

Important

Your users need both the USE AI FUNCTIONS account-level privilege and one of the CORTEX_USER or AI_FUNCTIONS_USER database roles to call Snowflake Cortex AI functions. Because USE AI FUNCTIONS is granted to the PUBLIC role by default, no additional action is needed for this privilege unless it has been revoked.

The AI_FUNCTIONS_USER role is not granted to the PUBLIC role by default. A user with the ACCOUNTADMIN role must explicitly grant this role to roles that require access to AI functions. The AI_FUNCTIONS_USER database role cannot be granted directly to users; it must be granted to roles that users can assume. For more information, see Using SNOWFLAKE database roles.

The following example creates a custom role, grants the AI_FUNCTIONS_USER database role to it, and assigns the role to a user.

USE ROLE ACCOUNTADMIN;

CREATE ROLE analyst_rl;
GRANT DATABASE ROLE SNOWFLAKE.AI_FUNCTIONS_USER TO ROLE analyst_rl;

GRANT ROLE analyst_rl TO USER some_user;

Alternatively, to give all users access to scalar AI function capabilities, grant the AI_FUNCTIONS_USER role to the PUBLIC role.

USE ROLE ACCOUNTADMIN;

GRANT DATABASE ROLE SNOWFLAKE.AI_FUNCTIONS_USER TO ROLE PUBLIC;
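The access rule described in this section (the account-level privilege plus one of the two database roles, with AI_FUNCTIONS_USER limited to scalar functions) can be summarized as a small predicate. The following Python sketch is only a mental model, not a Snowflake API: the function name can_call and its parameters are hypothetical, and it does not model CORTEX_EMBED_USER or the native-application exception noted earlier.

```python
# Aggregate functions that AI_FUNCTIONS_USER does not cover, per the text above.
AGGREGATE_FUNCTIONS = {"AI_AGG", "AI_SUMMARIZE_AGG"}


def can_call(function_name, has_use_ai_functions, database_roles):
    """Return True if a role with these grants may call the function.

    function_name: name of a Cortex AI function, for example "AI_COMPLETE".
    has_use_ai_functions: whether the role holds USE AI FUNCTIONS on the account.
    database_roles: set of SNOWFLAKE database roles granted to the role.
    """
    # The account-level privilege is always required.
    if not has_use_ai_functions:
        return False
    # CORTEX_USER allows calls to Cortex AI functions in general.
    if "CORTEX_USER" in database_roles:
        return True
    # AI_FUNCTIONS_USER allows scalar functions only.
    if "AI_FUNCTIONS_USER" in database_roles:
        return function_name.upper() not in AGGREGATE_FUNCTIONS
    return False


print(can_call("AI_SENTIMENT", True, {"AI_FUNCTIONS_USER"}))
print(can_call("AI_AGG", True, {"AI_FUNCTIONS_USER"}))
```

In this model, a role that holds only AI_FUNCTIONS_USER can call AI_SENTIMENT but not AI_AGG, and revoking USE AI FUNCTIONS blocks every call regardless of database roles.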

CORTEX_EMBED_USER database role

The CORTEX_EMBED_USER database role in the SNOWFLAKE database includes the privileges that allow users to call the text embedding functions AI_EMBED, EMBED_TEXT_768, and EMBED_TEXT_1024 and to create Cortex Search Services with managed vector embeddings. CORTEX_EMBED_USER allows you to grant embedding privileges separately from other Cortex AI capabilities.

Note

You can create Cortex Search Services with user-provided embeddings without the CORTEX_EMBED_USER role. In that case, you must generate the embeddings yourself, outside of Snowflake, and load them into a table.

Unlike the CORTEX_USER role, the CORTEX_EMBED_USER role is not granted to the PUBLIC role by default. You must explicitly grant this role to roles that require embedding capabilities if you have revoked the CORTEX_USER role. The CORTEX_EMBED_USER database role cannot be granted directly to users but must be granted to roles that users can assume. The following example illustrates this process.

USE ROLE ACCOUNTADMIN;

CREATE ROLE cortex_embed_user_role;
GRANT DATABASE ROLE SNOWFLAKE.CORTEX_EMBED_USER TO ROLE cortex_embed_user_role;

GRANT ROLE cortex_embed_user_role TO USER some_user;

Alternatively, to give all users access to embedding capabilities, grant the CORTEX_EMBED_USER role to the PUBLIC role as follows.

USE ROLE ACCOUNTADMIN;

GRANT DATABASE ROLE SNOWFLAKE.CORTEX_EMBED_USER TO ROLE PUBLIC;

Using AI Functions in stored procedures with EXECUTE AS RESTRICTED CALLER

To use AI Functions inside stored procedures with EXECUTE AS RESTRICTED CALLER, grant the following privileges to the role that created the stored procedure:

GRANT INHERITED CALLER USAGE ON ALL SCHEMAS IN DATABASE snowflake TO ROLE <role_that_created_the_stored_procedure>;
GRANT INHERITED CALLER USAGE ON ALL FUNCTIONS IN DATABASE snowflake TO ROLE <role_that_created_the_stored_procedure>;
GRANT CALLER USAGE ON DATABASE snowflake TO ROLE <role_that_created_the_stored_procedure>;

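Control model access

Snowflake Cortex provides two independent mechanisms to control access to models: an account-level allowlist (the CORTEX_MODELS_ALLOWLIST parameter) and role-based access control (RBAC).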

You can use the account-level allowlist to control model access across your entire account, or you can use RBAC to control model access on a per-role basis. For maximum flexibility, you can also use both mechanisms together, if you can accept additional management complexity.

Account-level allowlist parameter

You can control model access across your entire account using the CORTEX_MODELS_ALLOWLIST parameter. Supported features respect the value of this parameter and prevent use of models that are not in the allowlist.

The CORTEX_MODELS_ALLOWLIST parameter can be set to 'All', 'None', or to a comma-separated list of model names. Model names are case-sensitive and must be specified in lowercase (for example, 'mistral-large2' rather than 'MISTRAL-LARGE2'). This parameter can only be set at the account level, not at the user or session levels. Only the ACCOUNTADMIN role can set the parameter using the ALTER ACCOUNT command.

Examples:

  • To allow access to all models:

    ALTER ACCOUNT SET CORTEX_MODELS_ALLOWLIST = 'All';
  • To allow access to the mistral-large2 and llama3.1-70b models:

    ALTER ACCOUNT SET CORTEX_MODELS_ALLOWLIST = 'mistral-large2,llama3.1-70b';
  • To prevent access to any model:

    ALTER ACCOUNT SET CORTEX_MODELS_ALLOWLIST = 'None';
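Because the parameter value is a single comma-separated string of lowercase model names, it can be convenient to build the statement from a list and validate the names first. The following Python sketch assumes only the value format described above; build_allowlist_statement is a hypothetical helper, not part of any Snowflake library, and it only produces the SQL text to run as ACCOUNTADMIN.

```python
def build_allowlist_statement(models):
    """Build an ALTER ACCOUNT statement for CORTEX_MODELS_ALLOWLIST.

    models: list of model names; an empty list maps to 'None' (no models allowed).
    """
    if not models:
        value = "None"
    else:
        for name in models:
            # The text above states that names must be given in lowercase.
            if name != name.lower():
                raise ValueError(f"model name must be lowercase: {name!r}")
            # Reject characters that would corrupt the list or the SQL literal.
            if "," in name or "'" in name:
                raise ValueError(f"invalid character in model name: {name!r}")
        value = ",".join(models)
    return f"ALTER ACCOUNT SET CORTEX_MODELS_ALLOWLIST = '{value}';"


print(build_allowlist_statement(["mistral-large2", "llama3.1-70b"]))
```

Rejecting uppercase names up front avoids setting an allowlist entry that can never match a model name.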

As described in the next section, use RBAC to provide specific roles with access beyond what you specify in the allowlist.

Role-based access control (RBAC)

Although Cortex models are not themselves Snowflake objects, Snowflake lets you create model objects in the SNOWFLAKE.MODELS schema that represent the Cortex models. By applying RBAC to these objects, you can control access to models the same way you would any other Snowflake object. Supported features accept the identifiers of objects in SNOWFLAKE.MODELS wherever a model can be specified.

Tip

To use RBAC exclusively, set CORTEX_MODELS_ALLOWLIST to 'None'.

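Refresh model objects and application roles

SNOWFLAKE.MODELS is not automatically populated with the objects that represent the Cortex models. You must create these objects when you first set up model RBAC, and refresh them when you want to apply RBAC to new models.

As ACCOUNTADMIN, run the SNOWFLAKE.MODELS.CORTEX_BASE_MODELS_REFRESH stored procedure to populate the SNOWFLAKE.MODELS schema with objects that represent the currently available Cortex models, and to create application roles that correspond to the models. The procedure also creates CORTEX-MODEL-ROLE-ALL, a role that covers all models.

Tip

You can safely call CORTEX_BASE_MODELS_REFRESH at any time; it does not create duplicate objects or roles.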

CALL SNOWFLAKE.MODELS.CORTEX_BASE_MODELS_REFRESH();

After refreshing the model objects, you can verify that the models appear in the SNOWFLAKE.MODELS schema as follows:

SHOW MODELS IN SNOWFLAKE.MODELS;

The returned list of models resembles the following:

created_on | name | model_type | database_name | schema_name | owner
2025-04-22 09:35:38.558 -0700 | CLAUDE-4-5-SONNET | CORTEX_BASE | SNOWFLAKE | MODELS | SNOWFLAKE
2025-04-22 09:36:16.793 -0700 | LLAMA3.1-405B | CORTEX_BASE | SNOWFLAKE | MODELS | SNOWFLAKE
2025-04-22 09:37:18.692 -0700 | OPENAI-GPT-5.2 | CORTEX_BASE | SNOWFLAKE | MODELS | SNOWFLAKE

To verify that you can see the application roles associated with these models, use the SHOW APPLICATION ROLES command, as in the following example:

SHOW APPLICATION ROLES IN APPLICATION SNOWFLAKE;

The list of application roles resembles the following:

created_on | name | owner | comment | owner_role_type
2025-04-22 09:35:38.558 -0700 | CORTEX-MODEL-ROLE-ALL | SNOWFLAKE | MODELS | APPLICATION
2025-04-22 09:36:16.793 -0700 | CORTEX-MODEL-ROLE-LLAMA3.1-405B | SNOWFLAKE | MODELS | APPLICATION

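Grant application roles to user roles

After you create the model objects and application roles, you can grant the application roles to specific user roles in your account.

  • To grant a role access to a specific model: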

    GRANT APPLICATION ROLE SNOWFLAKE."CORTEX-MODEL-ROLE-LLAMA3.1-70B" TO ROLE MY_ROLE;
  • To grant a role access to all current and future models:

    GRANT APPLICATION ROLE SNOWFLAKE."CORTEX-MODEL-ROLE-ALL" TO ROLE MY_ROLE;
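When you grant access to several models, the GRANT statements can be generated from the model names. The following Python sketch assumes that each per-model application role is named CORTEX-MODEL-ROLE- followed by the uppercase model name, which is the pattern visible in the examples above; confirm the actual role names with SHOW APPLICATION ROLES before relying on it. The helper model_role_grant is hypothetical.

```python
def model_role_grant(model_name, target_role):
    """Return a GRANT statement for the application role of one model.

    Assumption: the role name is the prefix CORTEX-MODEL-ROLE- plus the
    uppercase model name, as in the examples in this section.
    """
    app_role = f"CORTEX-MODEL-ROLE-{model_name.upper()}"
    # The role name contains hyphens and dots, so it must be double-quoted.
    return f'GRANT APPLICATION ROLE SNOWFLAKE."{app_role}" TO ROLE {target_role};'


for model in ["llama3.1-70b", "llama3.1-405b"]:
    print(model_role_grant(model, "MY_ROLE"))
```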

Use model objects with supported features

To use model objects with supported Cortex features, specify the identifier of the model object in SNOWFLAKE.MODELS as the model argument. You can use a fully-qualified identifier, a partial identifier, or a simple model name that will be automatically resolved to SNOWFLAKE.MODELS.

  • Using a fully-qualified identifier:

    SELECT AI_COMPLETE('SNOWFLAKE.MODELS."LLAMA3.1-70B"', 'Hello');
  • Using a partial identifier:

    USE DATABASE SNOWFLAKE;
    USE SCHEMA MODELS;
    SELECT AI_COMPLETE('LLAMA3.1-70B', 'Hello');
  • Using automatic lookup with a simple model name:

    -- Automatically resolves to SNOWFLAKE.MODELS."LLAMA3.1-70B"
    SELECT AI_COMPLETE('llama3.1-70b', 'Hello');

Using RBAC with the account allowlist

A number of Cortex features accept a model name as a string argument, for example AI_COMPLETE('model', 'prompt'). When you provide a model name:

  1. Cortex first attempts to locate a matching model object in SNOWFLAKE.MODELS. If you provide an unqualified name like 'x', it automatically looks for SNOWFLAKE.MODELS."X".
  2. If the model object is found, RBAC is applied to determine whether the user can use the model.
  3. If no model object is found, the provided string is matched against the account-level allowlist.

The following example illustrates the use of allowlist and RBAC together. In this example, the allowlist is set to allow the mistral-large2 model, and the user has access to the LLAMA3.1-70B model object through RBAC.

-- set up access
USE SECONDARY ROLES NONE;
USE ROLE ACCOUNTADMIN;
ALTER ACCOUNT SET CORTEX_MODELS_ALLOWLIST = 'mistral-large2';
CALL SNOWFLAKE.MODELS.CORTEX_BASE_MODELS_REFRESH();
GRANT APPLICATION ROLE SNOWFLAKE."CORTEX-MODEL-ROLE-LLAMA3.1-70B" TO ROLE PUBLIC;

-- test access
USE ROLE PUBLIC;

-- this succeeds because mistral-large2 is in the allowlist
SELECT AI_COMPLETE('mistral-large2', 'Hello');

-- this succeeds because the role has access to the model object
SELECT AI_COMPLETE('SNOWFLAKE.MODELS."LLAMA3.1-70B"', 'Hello');

-- this fails because the first argument is
-- neither an identifier for an accessible model object
-- nor is it a model name in the allowlist
SELECT AI_COMPLETE('claude-sonnet-4-6', 'Hello');
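The outcome of the three calls above can be reproduced with a small model of the lookup. The following Python sketch is an interpretation, not Snowflake's implementation: it assumes that a model is usable when the role can access its model object through RBAC, or otherwise when the provided name is in the allowlist, which is what the example and the "access beyond the allowlist" wording imply. The names object_name and can_use_model are hypothetical.

```python
def object_name(model_arg):
    """Map a model argument to the name of its object in SNOWFLAKE.MODELS."""
    prefix = "SNOWFLAKE.MODELS."
    if model_arg.upper().startswith(prefix):
        # Qualified identifiers are quoted, so their case is kept as written.
        return model_arg[len(prefix):].strip('"')
    # An unqualified name such as 'x' is looked up as SNOWFLAKE.MODELS."X".
    return model_arg.upper()


def can_use_model(model_arg, rbac_objects, allowlist):
    """Return True if the model is usable under this simplified model.

    rbac_objects: names of model objects the role can access through RBAC.
    allowlist: value of CORTEX_MODELS_ALLOWLIST ('All', 'None', or a
        comma-separated list of lowercase model names).
    """
    # Steps 1 and 2: an accessible model object grants access.
    if object_name(model_arg) in rbac_objects:
        return True
    # Step 3: otherwise fall back to the account-level allowlist.
    if allowlist == "All":
        return True
    if allowlist == "None":
        return False
    return model_arg in allowlist.split(",")


rbac = {"LLAMA3.1-70B"}          # granted through CORTEX-MODEL-ROLE-LLAMA3.1-70B
allowlist = "mistral-large2"     # account-level allowlist

print(can_use_model("mistral-large2", rbac, allowlist))
print(can_use_model('SNOWFLAKE.MODELS."LLAMA3.1-70B"', rbac, allowlist))
print(can_use_model("claude-sonnet-4-6", rbac, allowlist))
```

The first two calls are allowed (one by the allowlist, one by RBAC) and the third is denied, matching the comments in the SQL example.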

Common pitfalls

  • Access to a model (whether by allowlist or RBAC) does not always mean that it can be used. It may still be subject to cross-region, deprecation, or other availability constraints. These restrictions can result in error messages that seem similar to model access errors.
  • Model access controls only govern the use of a model and not the use of a feature itself. A feature can have its own access controls. For example, access to AI_COMPLETE is governed by the CORTEX_USER or AI_FUNCTIONS_USER database role and the USE AI FUNCTIONS account-level privilege. For more information, see Cortex LLM privileges.
  • Not all features support model access controls. For more information about what a feature supports, see the supported features table.
  • Secondary roles can obscure permissions. For example, if a user has ACCOUNTADMIN as a secondary role, all model objects may appear accessible. Disable secondary roles temporarily when verifying permissions.
  • Qualified model object identifiers are quoted and therefore case-sensitive. For more information, see QUOTED_IDENTIFIERS_IGNORE_CASE.

Supported features

The following features support model access controls:

Important

Snowflake is in the process of enforcing model RBAC for additional Cortex AI Functions. When the 2026_02 behavior change bundle is enabled, model access controls (both CORTEX_MODELS_ALLOWLIST and model RBAC) will be enforced for additional functions including AI_TRANSCRIBE, AI_EXTRACT, AI_SENTIMENT, AI_TRANSLATE, CLASSIFY_TEXT, SUMMARIZE, EXTRACT_ANSWER, AI_PARSE_DOCUMENT, and AI_REDACT.

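Feature | Account-level allowlist | Role-based access control | Notes
AI_COMPLETE
AI_CLASSIFY | | | If the model powering this function is not allowed, the error message contains information about how to modify the allowlist.
AI_FILTER | | | If the model powering this function is not allowed, the error message contains information about how to modify the allowlist.
AI_AGG | | | If the model powering this function is not allowed, the error message contains information about how to modify the allowlist.
AI_SUMMARIZE_AGG | | | If the model powering this function is not allowed, the error message contains information about how to modify the allowlist.
COMPLETE (SNOWFLAKE.CORTEX)
TRY_COMPLETE (SNOWFLAKE.CORTEX)
Cortex REST API
Cortex Playground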

Availability

Snowflake Cortex AI functions are available in the following regions. If your region is not listed for a particular function, use cross-region inference.

Note

  • The TRY_COMPLETE function is available in the same regions as COMPLETE.
  • The AI_COUNT_TOKENS function is available in all regions for any model, but the models themselves are available only in the regions specified in the tables below.

The following functions and models are available in any region via cross-region inference.

Function | Model | Cross Cloud (Any Region) | AWS US (Cross-Region) | AWS US Commercial Gov (Cross-Region) | AWS EU (Cross-Region) | AWS APJ (Cross-Region) | AWS AU (Cross-Region) | Azure US (Cross-Region) | Azure EU (Cross-Region) | Google Cloud US (Cross-Region)
AI_COMPLETE
|  claude-opus-4-7
|  claude-sonnet-4-6
|  claude-opus-4-6
|  claude-sonnet-4-5
|  claude-opus-4-5
|  claude-haiku-4-5
|  claude-4-sonnet [legacy]
|  gemini-3.1-pro*
|  llama4-maverick
|  llama4-scout
|  llama3.1-8b
|  llama3.1-70b
|  llama3.3-70b
|  snowflake-llama-3.3-70b
|  llama3.1-405b
|  openai-gpt-5.2
|  openai-gpt-5.1
|  openai-gpt-5
|  openai-gpt-5-mini
|  openai-gpt-5-nano
|  openai-gpt-4.1
|  snowflake-llama-3.1-405b
|  deepseek-r1
|  mistral-large2
|  mixtral-8x7b
|  mistral-7b
EMBED_TEXT_768
|  e5-base-v2
|  snowflake-arctic-embed-m
|  snowflake-arctic-embed-m-v1.5
EMBED_TEXT_1024
|  snowflake-arctic-embed-l-v2.0
|  snowflake-arctic-embed-l-v2.0-8k
|  nv-embed-qa-4
|  multilingual-e5-large
|  voyage-multilingual-2
AI_CLASSIFY TEXT
AI_CLASSIFY IMAGE
AI_EXTRACT
AI_FILTER TEXT
AI_FILTER IMAGE
AI_AGG
AI_REDACT
AI_SENTIMENT
AI_SIMILARITY TEXT
AI_SIMILARITY IMAGE
AI_SUMMARIZE_AGG
AI_TRANSCRIBE
SENTIMENT
ENTITY_SENTIMENT
EXTRACT_ANSWER
SUMMARIZE
TRANSLATE
AI_TRANSLATE

* Indicates a preview function or model. Preview features are not suitable for production workloads.

The following Snowflake Cortex AI functions and models are available in the following extended regions.

Function | Model | AWS US East 2 (Ohio) | AWS CA Central 1 (Central) | AWS SA East 1 (São Paulo) | AWS Europe West 2 (London) | AWS Europe Central 1 (Frankfurt) | AWS Europe North 1 (Stockholm) | AWS AP Northeast 1 (Tokyo) | AWS AP South 1 (Mumbai) | AWS AP Southeast 2 (Sydney) | AWS AP Southeast 3 (Jakarta) | Azure South Central US (Texas) | Azure West US 2 (Washington) | Azure UK South (London) | Azure North Europe (Ireland) | Azure Switzerland North (Zürich) | Azure Central India (Pune) | Azure Japan East (Tokyo, Saitama) | Azure Southeast Asia (Singapore) | Azure Australia East (New South Wales) | Google Cloud Europe West 2 (London) | Google Cloud Europe West 4 (Netherlands) | Google Cloud US Central 1 (Iowa) | Google Cloud US East 4 (N. Virginia)
EMBED_TEXT_768
|  snowflake-arctic-embed-m-v1.5
|  snowflake-arctic-embed-m
EMBED_TEXT_1024
|  multilingual-e5-large
AI_TRANSCRIBE | Cross-region | Cross-region | Cross-region | Cross-region | Cross-region | Cross-region | Cross-region | Cross-region | Cross-region

The following table lists availability of legacy models. These models have not been deprecated and can still be used. However, Snowflake recommends newer models for new development.

Legacy

Function (Model) | AWS US West 2 (Oregon) | AWS US East 1 (N. Virginia) | AWS Europe Central 1 (Frankfurt) | AWS Europe West 1 (Ireland) | AWS AP Southeast 2 (Sydney) | AWS AP Northeast 1 (Tokyo) | Azure East US 2 (Virginia) | Azure West Europe (Netherlands)
COMPLETE
|  llama3-8b
|  llama3-70b
|  mistral-large

Cost considerations

Snowflake Cortex AI functions incur compute cost based on the number of tokens processed. Refer to the Snowflake Service Consumption Table for each function’s cost in credits per million tokens.

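A token is the smallest unit of text processed by Snowflake Cortex AI functions, approximately equal to four characters of text. The equivalence of raw input or output text to tokens can vary by model.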

  • For functions that generate new text using provided text (AI_COMPLETE, AI_CLASSIFY, AI_FILTER, AI_AGG, AI_SUMMARIZE, AI_TRANSLATE, and their previous versions in the SNOWFLAKE.CORTEX schema), both input and output tokens are billable.
  • For Cortex Guard, only input tokens are counted. The number of input tokens is based on the number of tokens output from AI_COMPLETE (or COMPLETE). Cortex Guard usage is billed in addition to the cost of the AI_COMPLETE (or COMPLETE) function.
  • For AI_SIMILARITY, AI_EMBED, and the SNOWFLAKE.CORTEX.EMBED_* functions, only input tokens are counted.
  • For EXTRACT_ANSWER, the number of billable tokens is the sum of the number of tokens in the from_text and question fields.
  • AI_CLASSIFY, AI_FILTER, AI_AGG, AI_SENTIMENT, AI_SUMMARIZE_AGG, SUMMARIZE, TRANSLATE, AI_TRANSLATE, EXTRACT_ANSWER, ENTITY_SENTIMENT, and SENTIMENT add a prompt to the input text in order to generate the response. As a result, the billed token count is higher than the number of tokens in the text you provide.
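  • AI_CLASSIFY labels, descriptions, and examples are counted as input tokens for each record processed, not just once for each AI_CLASSIFY call.
  • For PARSE_DOCUMENT (SNOWFLAKE.CORTEX), billing is based on the number of document pages processed.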
  • TRY_COMPLETE (SNOWFLAKE.CORTEX) does not incur costs for error handling. If the TRY_COMPLETE(SNOWFLAKE.CORTEX) function returns NULL, no cost is incurred.
  • For AI_EXTRACT, both input and output tokens are counted. The responseFormat argument is counted as input tokens. For document formats consisting of pages, the number of pages processed is counted as input tokens. Each page in a document is counted as 970 tokens.
  • AI_COUNT_TOKENS incurs only compute cost to run the function. No additional token-based costs are incurred.

For models that support media files such as images or audio:

  • Audio files are billed at 50 tokens per second of audio.
  • The token equivalence of images is determined by the model used.
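The fixed conversion rates stated in this section (about four characters per token, 50 tokens per second of audio, 970 tokens per AI_EXTRACT page) lend themselves to a rough pre-flight estimate. The following Python sketch is illustrative only: the helper names are hypothetical, rounding up is an assumption, and the credits-per-million-tokens rate must be taken from the Snowflake Service Consumption Table (the value in the example is a placeholder, not a real price).

```python
import math

CHARS_PER_TOKEN = 4            # rough average; the real ratio varies by model
AUDIO_TOKENS_PER_SECOND = 50   # audio billing rate stated above
EXTRACT_TOKENS_PER_PAGE = 970  # AI_EXTRACT page rate stated above


def estimate_text_tokens(text):
    """Approximate the token count of a text from its character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def audio_tokens(duration_seconds):
    """Tokens billed for an audio file of the given duration."""
    return math.ceil(duration_seconds * AUDIO_TOKENS_PER_SECOND)


def extract_page_tokens(pages):
    """Input tokens counted for the pages of a document passed to AI_EXTRACT."""
    return pages * EXTRACT_TOKENS_PER_PAGE


def estimate_credits(tokens, credits_per_million_tokens):
    """Convert a token count to credits at a given per-million-token rate."""
    return tokens / 1_000_000 * credits_per_million_tokens


# A 60-second audio clip and a 10-page document.
print(audio_tokens(60))
print(extract_page_tokens(10))
# Placeholder rate of 1.5 credits per million tokens (not a real price).
print(estimate_credits(2_000_000, 1.5))
```

For exact counts on text inputs, the AI_COUNT_TOKENS helper function described earlier is the authoritative source; this estimate only helps with sizing before a job is run.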

Snowflake recommends executing queries that call a Snowflake Cortex AI Function with a smaller warehouse (no larger than MEDIUM). Larger warehouses do not increase performance. The cost associated with keeping a warehouse active continues to apply when executing a query that calls a Snowflake Cortex LLM Function. For general information on compute costs, see Understanding compute cost.

Warehouse sizing

Snowflake recommends using a warehouse size no larger than MEDIUM when calling Snowflake Cortex AI Functions. Using a larger warehouse than necessary does not increase performance, but can result in unnecessary costs. This recommendation may change in the future as we continue to evolve Cortex AI Functions.

Track costs for AI services

To track credits used for AI Services, including LLM Functions, in your account, use the METERING_DAILY_HISTORY view:

SELECT *
  FROM SNOWFLAKE.ACCOUNT_USAGE.METERING_DAILY_HISTORY
  WHERE SERVICE_TYPE='AI_SERVICES';

Track credit consumption for Cortex AI Functions

To view the credit and token consumption for each AI Function call, use the CORTEX_FUNCTIONS_USAGE_HISTORY view:

SELECT *
  FROM SNOWFLAKE.ACCOUNT_USAGE.CORTEX_FUNCTIONS_USAGE_HISTORY;

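You can also view the credit and token usage of each query in your Snowflake account. Viewing per-query credit and token usage helps you identify the queries that consume the most credits and tokens.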

The following example query uses the CORTEX_FUNCTIONS_QUERY_USAGE_HISTORY view to show the credit and token consumption for all of your queries within your account.

SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.CORTEX_FUNCTIONS_QUERY_USAGE_HISTORY;

You can also use the same view to see the credit and token usage of a specific query.

SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.CORTEX_FUNCTIONS_QUERY_USAGE_HISTORY
WHERE query_id='<query-id>';

Note

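You can’t get detailed usage information for REST API requests.

Query usage history is grouped by the models used in the query. For example, if you ran the following command: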

SELECT AI_COMPLETE('mistral-7b', 'Is a hot dog a sandwich'), AI_COMPLETE('mistral-large', 'Is a hot dog a sandwich');

The query usage history would show two rows, one for mistral-7b and one for mistral-large.

Model restrictions

Models used by Snowflake Cortex have limitations on size as described in the table below. Sizes are given in tokens. According to industry estimates, tokens generally represent about four characters of text, so the number of words corresponding to a token limit is less than the number of tokens. Inputs that exceed the context window limit result in an error. Output that exceeds the context window limit is truncated.

The maximum size of the output that a model can produce is limited by the following:

  • The model’s output token limit.
  • The space available in the context window after the model consumes the input tokens.

For example, claude-sonnet-4-5 has a context window of 200,000 tokens and an output limit of 64,000 tokens. If 100,000 tokens are used for the input, the model can generate up to 64,000 tokens. However, if 195,000 tokens are used as input, the model can generate only up to 5,000 tokens, for a total of 200,000 tokens.
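The two limits combine as a simple minimum: the output can be no larger than the model's output limit and no larger than the space left in the context window. The following Python sketch uses the claude-sonnet-4-5 figures from the table in this section (200,000-token context window, 64,000-token output limit); the function name max_output_tokens is hypothetical.

```python
def max_output_tokens(context_window, output_limit, input_tokens):
    """Return the largest output, in tokens, that fits both limits."""
    # Inputs larger than the context window are rejected with an error.
    if input_tokens > context_window:
        raise ValueError("input exceeds the context window")
    # Output is capped by the output limit and by the remaining window space.
    return min(output_limit, context_window - input_tokens)


# claude-sonnet-4-5: 200,000-token context window, 64,000-token output limit.
print(max_output_tokens(200_000, 64_000, 100_000))  # output limit applies
print(max_output_tokens(200_000, 64_000, 195_000))  # remaining window applies
```

With 100,000 input tokens the output limit is the binding constraint; with 195,000 input tokens only 5,000 tokens of the window remain, so that becomes the cap.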

Important

In the AWS AP Southeast 2 (Sydney) region:

  • The context window for llama3-8b and mistral-7b is 4,096 tokens.
  • The context window for llama3.1-8b is 16,384 tokens.
  • The context window for the Snowflake managed model from the SUMMARIZE function is 4,096 tokens.

In the AWS Europe West 1 (Ireland) region:

  • The context window for llama3.1-8b is 16,384 tokens.
  • The context window for mistral-7b is 4,096 tokens.

Function | Model | Context window (tokens) | Max output AISQL functions (tokens)
COMPLETE | llama4-maverick | 128,000 | 8,192
| llama4-scout | 128,000 | 8,192
| deepseek-r1 | 32,768 | 8,192
| claude-sonnet-4-6 | 1,000,000 | 64,000
| claude-opus-4-7 | 1,000,000 | 128,000
| claude-opus-4-6 | 1,000,000 | 128,000
| claude-sonnet-4-5 | 200,000 | 64,000
| claude-haiku-4-5 | 200,000 | 64,000
| claude-opus-4-5 | 200,000 | 64,000
| gemini-3.1-pro | 1,000,000 | 64,000
| mistral-large | 32,000 | 8,192
| mistral-large2 | 128,000 | 8,192
| openai-gpt-5.1 | 272,000 | 8,192
| openai-gpt-5 | 272,000 | 8,192
| openai-gpt-5-mini | 272,000 | 8,192
| openai-gpt-5-nano | 272,000 | 8,192
| openai-gpt-4.1 | 128,000 | 32,000
| mixtral-8x7b | 32,000 | 8,192
| llama3.1-8b | 128,000 | 8,192
| llama3.1-70b | 128,000 | 8,192
| llama3.3-70b | 128,000 | 8,192
| snowflake-llama-3.3-70b | 128,000 | 8,192
| llama3.1-405b | 128,000 | 8,192
| snowflake-llama-3.1-405b | 8,000 | 8,192
| mistral-7b | 32,000 | 8,192
EMBED_TEXT_768 | e5-base-v2 | 512 | N/A
| snowflake-arctic-embed-m | 512 | N/A
EMBED_TEXT_1024 | nv-embed-qa-4 | 512 | N/A
| multilingual-e5-large | 512 | N/A
| voyage-multilingual-2 | 32,000 | N/A
AI_TRANSCRIBE | arctic-extract | 128,000 | 512
AI_FILTER | Snowflake managed model | 128,000 | N/A
AI_CLASSIFY TEXT | Snowflake managed model | 128,000 | N/A
AI_AGG | Snowflake managed model | 128,000 per row; can be used across multiple rows | 8,192
AI_SENTIMENT | Snowflake managed model | 2,048 | N/A
AI_SUMMARIZE_AGG | Snowflake managed model | 128,000 per row; can be used across multiple rows | 8,192
ENTITY_SENTIMENT | Snowflake managed model | 2,048 | N/A
EXTRACT_ANSWER | Snowflake managed model | 2,048 for text; 64 for question | N/A
SENTIMENT | Snowflake managed model | 512 | N/A
SUMMARIZE | Snowflake managed model | 32,000 | 4,096
TRANSLATE | Snowflake managed model | 4,096 | N/A

Choosing a model

The Snowflake Cortex AI_COMPLETE function supports multiple models of varying capability, latency, and cost. These models have been carefully chosen to align with common customer use cases. To achieve the best performance per credit, choose a model that’s a good match for the content size and complexity of your task. Here are brief overviews of the available models.

Large models

If you’re not sure where to start, try the most capable models first to establish a baseline to evaluate other models. claude-sonnet-4-6 and mistral-large2 are the most capable models offered by Snowflake Cortex, and will give you a good idea of what a state-of-the-art model can do.

  • claude-sonnet-4-6 is a leader in general reasoning and multimodal capabilities. It outperforms its predecessors in tasks that require reasoning across different domains and modalities. You can use its large output capacity to get more information from either structured or unstructured queries. Its reasoning capabilities and large context window make it well-suited for agentic workflows.
  • deepseek-r1 is a foundation model trained using large-scale reinforcement-learning (RL) without supervised fine-tuning (SFT). It can deliver high performance across math, code, and reasoning tasks. To access the model, set the cross-region inference parameter to AWS_US.
  • mistral-large2 is Mistral AI’s most advanced large language model with top-tier reasoning capabilities. Compared to mistral-large, it’s significantly more capable in code generation, mathematics, reasoning, and provides much stronger multilingual support. It’s ideal for complex tasks that require large reasoning capabilities or are highly specialized, such as synthetic text generation, code generation, and multilingual text analytics.
  • snowflake-llama3.1-405b is a model derived from the open source llama3.1 model. It uses the SwiftKV optimizations developed by the Snowflake AI research team to deliver up to a 75% inference cost reduction. SwiftKV achieves higher throughput performance with minimal accuracy loss.
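
One way to apply the baseline approach described at the start of this section is to run the same prompt through a capable model and a smaller candidate, then compare the answers side by side. The sketch below is an illustration, not a Snowflake utility: `compare_models` is a hypothetical helper, and the completion function is passed in as a parameter so the comparison logic can be tried without a Snowflake session. In a real session you would presumably pass `snowflake.cortex.complete`, which takes the model name and the prompt in that order.

```python
from typing import Callable, Dict, Iterable


def compare_models(
    models: Iterable[str],
    prompt: str,
    complete_fn: Callable[[str, str], str],
) -> Dict[str, str]:
    """Run one prompt through several models and collect the responses."""
    return {model: complete_fn(model, prompt) for model in models}


if __name__ == "__main__":
    # Stand-in for snowflake.cortex.complete so the sketch runs anywhere.
    def fake_complete(model: str, prompt: str) -> str:
        return f"[{model}] response to: {prompt}"

    results = compare_models(
        ["mistral-large2", "llama3.1-8b"],
        "Summarize the benefits of columnar storage in one sentence.",
        fake_complete,
    )
    for model, answer in results.items():
        print(model, "->", answer)
```

Injecting the completion function keeps the helper easy to test and lets you swap in a different client later without changing the comparison code.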

Medium models

  • llama3.1-70b is an open source model that demonstrates state-of-the-art performance ideal for chat applications, content creation, and enterprise applications. It is a highly performant, cost-effective model that enables diverse use cases with a context window of 128K. llama3-70b is still supported and has a context window of 8K.
  • snowflake-llama3.3-70b is a model derived from the open source llama3.3 model. It uses the SwiftKV optimizations developed by the Snowflake AI research team to deliver up to a 75% inference cost reduction. SwiftKV achieves higher throughput performance with minimal accuracy loss.
  • mixtral-8x7b is ideal for text generation, classification, and question answering. Mistral models are optimized for low latency with low memory requirements, which translates into higher throughput for enterprise use cases.

Small models

  • llama3.1-8b is ideal for tasks that require low to moderate reasoning. It’s a lightweight, ultra-fast model with a context window of 128K. llama3-8b provides a smaller context window and relatively lower accuracy.
  • mistral-7b is ideal for your simplest summarization, structuring, and question answering tasks that need to be done quickly. It offers low latency and high throughput processing for multiple pages of text with its 32K context window.

The following table shows how common models perform on various benchmarks, including the models offered by Snowflake Cortex COMPLETE and a few other popular models.

Model | Context Window (Tokens) | MMLU (Reasoning) | HumanEval (Coding) | GSM8K (Arithmetic Reasoning) | Spider 1.0 (SQL)
GPT 4.o (https://openai.com/index/hello-gpt-4o/) | 128,000 | 88.7 | 90.2 | 96.4 | -
Claude 3.5 Sonnet (https://www.anthropic.com/claude) | 200,000 | 88.3 | 92.0 | 96.4 | -
llama3.1-405b (https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/MODEL_CARD.md) | 128,000 | 88.6 | 89 | 96.8 | -
llama3.1-70b (https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/MODEL_CARD.md) | 128,000 | 86 | 80.5 | 95.1 | -
mistral-large2 (https://mistral.ai/news/mistral-large-2407/) | 128,000 | 84 | 92 | 93 | -
llama3.1-8b (https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/MODEL_CARD.md) | 128,000 | 73 | 72.6 | 84.9 | -
mixtral-8x7b (https://mistral.ai/news/mixtral-of-experts/) | 32,000 | 70.6 | 40.2 | 60.4 | -
Snowflake Arctic | 4,096 | 67.3 | 64.3 | 69.7 | 79
mistral-7b (https://mistral.ai/news/announcing-mistral-7b/) | 32,000 | 62.5 | 26.2 | 52.1 | -
GPT 3.5 Turbo* | 4,097 | 70 | 48.1 | 57.1 | -
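
The table and the size tiers above can be combined into a simple selection rule: take the smallest tier whose context window and benchmark score meet the task's needs. The following sketch is one possible way to encode that rule, not an official recommendation. `ModelInfo`, `MODELS`, and `pick_model` are hypothetical names, the context windows and MMLU scores are copied from the table, and treating a smaller tier as a proxy for lower cost is an assumption.

```python
from typing import NamedTuple, Optional


class ModelInfo(NamedTuple):
    name: str
    tier: int            # 0 = small, 1 = medium, 2 = large (tiers as described above)
    context_window: int  # tokens, copied from the table
    mmlu: float          # MMLU score, copied from the table


MODELS = [
    ModelInfo("mistral-7b", 0, 32_000, 62.5),
    ModelInfo("llama3.1-8b", 0, 128_000, 73.0),
    ModelInfo("mixtral-8x7b", 1, 32_000, 70.6),
    ModelInfo("llama3.1-70b", 1, 128_000, 86.0),
    ModelInfo("mistral-large2", 2, 128_000, 84.0),
]


def pick_model(min_context: int, min_mmlu: float = 0.0) -> Optional[str]:
    """Return the smallest-tier model that meets both thresholds, or None."""
    candidates = [
        m for m in MODELS
        if m.context_window >= min_context and m.mmlu >= min_mmlu
    ]
    if not candidates:
        return None
    # Prefer the smaller tier (assumed cheaper); break ties by higher MMLU.
    best = min(candidates, key=lambda m: (m.tier, -m.mmlu))
    return best.name
```

For example, `pick_model(100_000, 70)` selects `llama3.1-8b`, while raising the MMLU threshold to 80 moves the choice up to `llama3.1-70b`. A single benchmark is a rough signal, so the result is best treated as a starting point for evaluation on your own data.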

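Previous model versions

The Snowflake Cortex COMPLETE function also supports the following older model versions. We recommend using the latest model versions instead of the versions listed in this table.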

Model | Context Window (Tokens) | MMLU (Reasoning) | HumanEval (Coding) | GSM8K (Arithmetic Reasoning) | Spider 1.0 (SQL)
mistral-large (https://mistral.ai/news/mistral-large/) | 32,000 | 81.2 | 45.1 | 81 | 81
llama-2-70b-chat | 4,096 | 68.9 | 30.5 | 57.5 | -

Using Snowflake Cortex AI Functions with Python

Call Cortex AI Functions in Snowpark Python

You can use Snowflake Cortex AI Functions in the Snowpark Python API. The functions in Snowpark Python have names in Pythonic “snake_case” format, with words separated by underscores and all letters in lowercase. The examples below cover ai_agg, ai_classify, and ai_filter.

ai_agg example

The ai_agg function aggregates a column of text using natural language instructions in a similar manner to how you would ask an analyst to summarize or extract findings from grouped or ungrouped data.

The following example summarizes customer reviews for each product using the ai_agg function. The function takes a column of text and a natural language instruction to summarize the reviews.

from snowflake.snowpark.functions import ai_agg, col

df = session.create_dataframe([
    [1, "Excellent product!"],
    [1, "Great battery life."],
    [1, "A bit expensive but worth it."],
    [2, "Terrible customer service."],
    [2, "Won’t buy again."],
], schema=["product_id", "review"])

# Summarize reviews per product
summary_df = df.group_by("product_id").agg(
    ai_agg(col("review"), "Summarize the customer reviews in one sentence.")
)
summary_df.show()

Note

Use task descriptions that are detailed and centered on the use case. For example, “Summarize the customer feedback for an investor report”.

Classify text with ai_classify

The ai_classify function takes a string or image and classifies it into the categories that you define.

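The following example classifies travel reviews into categories such as “travel” and “cooking”. The function takes a column of text and a list of categories to classify the text into.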

from snowflake.snowpark.functions import ai_classify, col

df = session.create_dataframe([
    ["I dream of backpacking across South America."],
    ["I made the best pasta yesterday."],
], schema=["sentence"])

df = df.select(
    "sentence",
    ai_classify(col("sentence"), ["travel", "cooking"]).alias("classification")
)
df.show()

Note

You can provide up to 500 categories. You can classify both text and images.
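
When the category list is built dynamically, it can help to check it against the 500-category limit before calling ai_classify. The sketch below is a client-side convenience, not part of Snowpark: `validate_categories` is a hypothetical helper, and trimming whitespace and dropping duplicate labels are design choices made for this example rather than documented requirements.

```python
from typing import List, Sequence

MAX_CATEGORIES = 500  # limit stated in the note above


def validate_categories(categories: Sequence[str]) -> List[str]:
    """Trim labels, drop duplicates (keeping order), and check the count."""
    cleaned: List[str] = []
    seen = set()
    for category in categories:
        label = category.strip()
        if label and label not in seen:
            seen.add(label)
            cleaned.append(label)
    if not cleaned:
        raise ValueError("At least one non-empty category is required.")
    if len(cleaned) > MAX_CATEGORIES:
        raise ValueError(
            f"{len(cleaned)} categories given; the limit is {MAX_CATEGORIES}."
        )
    return cleaned
```

The returned list can then be passed as the second argument to ai_classify, in the same position as `["travel", "cooking"]` in the example above.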

Filter rows with ai_filter

The ai_filter function evaluates a natural language condition and returns True or False. You can use it to filter or tag rows.

from snowflake.snowpark.functions import ai_filter, prompt, col

df = session.create_dataframe(["Canada", "Germany", "Japan"], schema=["country"])

filtered_df = df.select(
    "country",
    ai_filter(prompt("Is {0} in Asia?", col("country"))).alias("is_in_asia")
)
filtered_df.show()

Note

You can filter on both strings and files. For dynamic prompts, use the prompt function. For more information, see Snowpark Python reference.

Call Cortex AI Functions in Snowflake ML

Snowflake ML contains the older AI Functions, those with names that don’t begin with “AI”. These functions are supported in version 1.1.2 and later of Snowflake ML. The names are rendered in Pythonic “snake_case” format, with words separated by underscores and all letters in lowercase.

If you run your Python script outside of Snowflake, you must create a Snowpark session to use these functions. See Connecting to Snowflake for instructions.

Process single values

The following Python example illustrates calling Snowflake Cortex AI functions on single values:

from snowflake.cortex import complete, extract_answer, sentiment, summarize, translate

text = """
    The Snowflake company was co-founded by Thierry Cruanes, Marcin Zukowski,
    and Benoit Dageville in 2012 and is headquartered in Bozeman, Montana.
"""

print(complete("llama3.1-8b", "how do snowflakes get their unique patterns?"))
print(extract_answer(text, "When was snowflake founded?"))
print(sentiment("I really enjoyed this restaurant. Fantastic service!"))
print(summarize(text))
print(translate(text, "en", "fr"))

Pass hyperparameter options

You can pass options that affect the model’s hyperparameters when using the complete function. The following Python example illustrates modifying the maximum number of output tokens that the model can generate:

from snowflake.cortex import complete, CompleteOptions

model_options1 = CompleteOptions(
    {'max_tokens':30}
)

print(complete("llama3.1-8b", "how do snowflakes get their unique patterns?", options=model_options1))

Call functions on table columns

You can call an AI function on a table column, as shown below. This example requires a session object (stored in session) and a table articles containing a text column abstract_text, and creates a new column abstract_summary containing a summary of the abstract.

from snowflake.cortex import summarize
from snowflake.snowpark.functions import col

article_df = session.table("articles")
article_df = article_df.withColumn(
    "abstract_summary",
    summarize(col("abstract_text"))
)
article_df.collect()

Note

The advanced chat-style (multi-message) form of COMPLETE is not currently supported in Python.

Using Snowflake Cortex AI functions with Snowflake CLI

Snowflake Cortex AI Functions are available in Snowflake CLI version 2.4.0 and later. See Introducing Snowflake CLI for more information about using Snowflake CLI. The functions are the old-style functions, those with names that don’t begin with “AI”.

The following examples illustrate using the snow cortex commands on single values. The -c parameter specifies which connection to use.

Note

The advanced chat-style (multi-message) form of COMPLETE is not currently supported in Snowflake CLI.

snow cortex complete "Is 5 more than 4? Please answer using one word without a period." -c "snowhouse"
snow cortex extract-answer "what is snowflake?" "snowflake is a company" -c "snowhouse"
snow cortex sentiment "Mary had a little Lamb" -c "snowhouse"
snow cortex summarize "John has a car. John's car is blue. John's car is old and John is thinking about buying a new car. There are a lot of cars to choose from and John cannot sleep because it's an important decision for John."
snow cortex translate herb --to pl

You can also use files that contain the text you want to use for the commands. For this example, assume that the file about_cortex.txt contains the following content:

Snowflake Cortex gives you instant access to industry-leading large language models (LLMs) trained by researchers at companies like Anthropic, Mistral, Reka, Meta, and Google, including Snowflake Arctic, an open enterprise-grade model developed by Snowflake.

Since these LLMs are fully hosted and managed by Snowflake, using them requires no setup. Your data stays within Snowflake, giving you the performance, scalability, and governance you expect.

Snowflake Cortex features are provided as SQL functions and are also available in Python. The available functions are summarized below.

COMPLETE: Given a prompt, returns a response that completes the prompt. This function accepts either a single prompt or a conversation with multiple prompts and responses.
EMBED_TEXT_768: Given a piece of text, returns a vector embedding that represents that text.
EXTRACT_ANSWER: Given a question and unstructured data, returns the answer to the question if it can be found in the data.
SENTIMENT: Returns a sentiment score, from -1 to 1, representing the detected positive or negative sentiment of the given text.
SUMMARIZE: Returns a summary of the given text.
TRANSLATE: Translates given text from any supported language to any other.

You can then execute the snow cortex summarize command by passing in the filename using the --file parameter, as shown:

snow cortex summarize --file about_cortex.txt
Snowflake Cortex offers instant access to industry-leading language models, including Snowflake Arctic, with SQL functions for completing prompts (COMPLETE), text embedding (EMBED_TEXT_768), extracting answers (EXTRACT_ANSWER), sentiment analysis (SENTIMENT), summarizing text (SUMMARIZE), and translating text (TRANSLATE).

For more information about these commands, see snow cortex commands.
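
If you want to drive these commands from a script, one option is to build the argument list in Python and run it with `subprocess`. The following is a minimal sketch under stated assumptions: `build_snow_cortex_command` and `run_snow_cortex` are hypothetical helpers, they cover only the single-input form shown above (inline text or `--file`, plus the optional `-c` connection), and running the command requires Snowflake CLI to be installed and configured.

```python
import subprocess
from typing import List, Optional


def build_snow_cortex_command(
    subcommand: str,
    text: Optional[str] = None,
    file: Optional[str] = None,
    connection: Optional[str] = None,
) -> List[str]:
    """Build the argument list for a single-input `snow cortex` command."""
    if (text is None) == (file is None):
        raise ValueError("Provide exactly one of text or file.")
    command = ["snow", "cortex", subcommand]
    if file is not None:
        command += ["--file", file]
    else:
        command.append(text)
    if connection is not None:
        command += ["-c", connection]
    return command


def run_snow_cortex(command: List[str]) -> str:
    """Run the command and return its standard output (needs Snowflake CLI)."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

Passing the arguments as a list avoids shell quoting problems with prompts that contain apostrophes or other special characters.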

Legal notices

The data classification of inputs and outputs are as set forth in the following table.

Input data classification | Output data classification
Usage Data | Customer Data

For additional information, refer to Snowflake AI and ML.