Categories: String & binary functions (AI Functions)
AI_COMPLETE (Single string)¶
Note
AI_COMPLETE is the updated version of COMPLETE (SNOWFLAKE.CORTEX). For the latest functionality, use AI_COMPLETE.
Generates a response (completion) for a text prompt using a supported language model.
Syntax¶
The function takes two required arguments (model and prompt) and three optional arguments (model_parameters, response_format, and show_details). The function can be called with either positional or named argument syntax.
Using AI_COMPLETE with a single string input
Arguments¶
model: A string specifying the model to use. Specify one of the following models:
claude-4-opus, claude-4-sonnet, claude-3-7-sonnet, claude-3-5-sonnet, deepseek-r1, llama3-8b, llama3-70b, llama3.1-8b, llama3.1-70b, llama3.1-405b, llama3.3-70b, llama4-maverick, llama4-scout, mistral-large, mistral-large2, mistral-7b, mixtral-8x7b, openai-gpt-4.1, openai-o4-mini, snowflake-arctic, snowflake-llama-3.1-405b, snowflake-llama-3.3-70b
Supported models might have different costs.
prompt: A string containing the prompt.
model_parameters: An object containing zero or more of the following options that affect the model’s hyperparameters. See LLM Settings (https://www.promptingguide.ai/introduction/settings).
temperature: A value from 0 to 1 (inclusive) that controls the randomness of the output of the language model. A higher temperature (for example, 0.7) results in more diverse and random output, while a lower temperature (such as 0.2) makes the output more deterministic and focused. Default: 0
top_p: A value from 0 to 1 (inclusive) that controls the randomness and diversity of the language model, generally used as an alternative to temperature. The difference is that top_p restricts the set of possible tokens that the model outputs, while temperature influences which tokens are chosen at each step. Default: 0
max_tokens: Sets the maximum number of output tokens in the response. Small values can result in truncated responses. Default: 4096. Maximum allowed value: 8192
guardrails: Filters potentially unsafe and harmful responses from the language model using Cortex Guard. Either TRUE or FALSE. Default: FALSE
response_format: The format that the response should follow. You can specify the response format as either of the following:
A JSON schema (https://json-schema.org/) that the response should follow. This is a SQL sub-object, not a string.
A SQL type literal beginning with the TYPE keyword. The defined type must use an OBJECT as its top-level container, and fields of this OBJECT are mapped to corresponding JSON fields and values.
If response_format is not specified, the response is a string containing either the response or a serialized JSON object containing the response and information about it. For more information, see AI_COMPLETE structured outputs.
show_details: A boolean flag that indicates whether to return a serialized JSON object containing the response and information about it.
Returns¶
When the show_details argument is not specified or set to FALSE and the response_format is not specified or set to NULL, returns a string containing the response.
When the show_details argument is not specified or set to FALSE and the response_format is specified, returns an object following the provided response format.
When the show_details argument is set to TRUE and the response_format is not specified, returns a JSON object containing the following keys:
"choices": An array of the model’s responses. (Currently, only one response is provided.) Each response is an object containing a "messages" key whose value is the model’s response to the latest prompt.
"created": UNIX timestamp (seconds since midnight, January 1, 1970) when the response was generated.
"model": The name of the model that created the response.
"usage": An object recording the number of tokens consumed and generated by this completion. Includes the following sub-keys:
"completion_tokens": The number of tokens in the generated response.
"prompt_tokens": The number of tokens in the prompt.
"total_tokens": The total number of tokens consumed, which is the sum of the other two values.
When the show_details argument is set to TRUE and the response_format is specified, returns a JSON object containing the following keys:
"structured_output": A JSON object following the specified response format.
"created": UNIX timestamp (seconds since midnight, January 1, 1970) when the response was generated.
"model": The name of the model that created the response.
"usage": An object recording the number of tokens consumed and generated by this completion. Includes the following sub-keys:
"completion_tokens": The number of tokens in the generated response.
"prompt_tokens": The number of tokens in the prompt.
"total_tokens": The total number of tokens consumed, which is the sum of the other two values.
Examples¶
Single response¶
To generate a single response:
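A minimal sketch of a call with positional arguments (the model name and prompt here are illustrative; any supported model works):

```sql
-- Generate a single completion for a literal prompt
SELECT AI_COMPLETE('claude-3-7-sonnet', 'What are large language models?');
```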
Responses from table column¶
The following example generates a response for each row in the reviews table, using the content column as input. Each query result contains a critique of the corresponding review.
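One way such a query might look, assuming the reviews table and content column described above (the model choice, prompt wording, and LIMIT are illustrative):

```sql
-- Critique each review; the <review> tags delimit the input text in the prompt
SELECT AI_COMPLETE(
        'mistral-large2',
        'Critique this review in bullet points: <review>' || content || '</review>'
       ) AS critique
FROM reviews
LIMIT 10;
```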
Tip
As shown in this example, you can use tagging in the prompt to control the kind of response generated. See A guide to prompting LLaMA 2 (https://replicate.com/blog/how-to-prompt-llama) for tips.
Controlling model parameters¶
The following example specifies the model_parameters used to provide a response.
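A sketch using named arguments, passing temperature, top_p, and max_tokens (the model and the specific values are illustrative):

```sql
-- max_tokens is set very low here to demonstrate truncation
SELECT AI_COMPLETE(
    model => 'llama3.1-70b',
    prompt => 'how does a snowflake get its unique pattern?',
    model_parameters => {
        'temperature': 0.7,
        'top_p': 0.9,
        'max_tokens': 10
    }
);
```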
The response is a string containing the message from the language model. Note that the response is truncated to the number of tokens specified by the max_tokens option in the model_parameters argument.
Detailed output¶
The following example shows how you can use the show_details argument to return additional inference details.
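A sketch of the same kind of call with show_details enabled (model, prompt, and the max_tokens value are illustrative):

```sql
-- show_details => TRUE returns a JSON object with choices, created, model, and usage
SELECT AI_COMPLETE(
    model => 'llama3.1-70b',
    prompt => 'how does a snowflake get its unique pattern?',
    model_parameters => {'max_tokens': 10},
    show_details => TRUE
);
```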
The response is a JSON object with the model’s message and related inference details. The max_tokens option in the model_parameters argument was used to truncate the output.
Specifying a JSON response format¶
This example illustrates the use of the function’s response_format argument to return a structured response by providing a type literal.
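One way this might look: the type literal begins with the TYPE keyword and uses an OBJECT as its top-level container, whose fields map to JSON fields in the response (the prompt and field names here are illustrative):

```sql
-- The model's output is shaped into an OBJECT with the declared fields
SELECT AI_COMPLETE(
    model => 'llama3.1-70b',
    prompt => 'Extract the sentiment and the product name from this review: I love my new SnowPhone!',
    response_format => TYPE OBJECT(sentiment VARCHAR, product VARCHAR)
);
```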
The response is a JSON object following the structured response format.
Specifying a JSON response format with details, using a type literal¶
This example illustrates the use of the response_format argument to return a structured response, combined with show_details to get additional inference information, using a type literal.
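A sketch combining a type literal with detailed output (the prompt and field names are illustrative):

```sql
-- With show_details => TRUE, the structured result appears under "structured_output"
SELECT AI_COMPLETE(
    model => 'llama3.1-70b',
    prompt => 'Extract the sentiment and the product name from this review: I love my new SnowPhone!',
    response_format => TYPE OBJECT(sentiment VARCHAR, product VARCHAR),
    show_details => TRUE
);
```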
The response is a JSON object containing the structured response along with additional inference metadata.
Specifying a JSON response format with details, using a JSON schema¶
This example illustrates the use of the function’s response_format argument to return a structured response combined with show_details to get additional inference information, using a JSON schema.
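A sketch passing the schema as a SQL sub-object rather than a type literal (the schema keys follow JSON Schema conventions; the prompt and property names are illustrative):

```sql
-- response_format is a SQL sub-object, not a string
SELECT AI_COMPLETE(
    model => 'llama3.1-70b',
    prompt => 'Extract the sentiment and the product name from this review: I love my new SnowPhone!',
    response_format => {
        'type': 'json',
        'schema': {
            'type': 'object',
            'properties': {
                'sentiment': {'type': 'string'},
                'product': {'type': 'string'}
            },
            'required': ['sentiment', 'product']
        }
    },
    show_details => TRUE
);
```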
The response is a JSON object containing the structured response along with additional inference metadata.