Categories:

String & binary functions (Large Language Model)

COUNT_TOKENS (SNOWFLAKE.CORTEX)

Returns the number of tokens in a prompt for the large language model or the task-specific function specified in the argument. This function does not support fine-tuned models.

Note

We are working on more accurate token count estimation for functions such as ai_classify and ai_filter.

Syntax

SNOWFLAKE.CORTEX.COUNT_TOKENS( <model_name> , <input_text> )

Arguments

Required:

model_name

Name of the model you want to base the token count on. Specify one of the following values:

  • deepseek-r1

  • e5-base-v2

  • e5-large-v2

  • gemma-7b

  • jamba-1.5-large

  • jamba-1.5-mini

  • jamba-instruct

  • llama2-70b-chat

  • llama3-70b

  • llama3-8b

  • llama3.1-405b

  • llama3.1-70b

  • llama3.1-8b

  • llama3.2-1b

  • llama3.2-3b

  • llama3.3-70b

  • llama4-maverick

  • llama4-scout

  • mistral-7b

  • mistral-large

  • mistral-large2

  • mixtral-8x7b

  • nv-embed-qa-4

  • reka-core

  • reka-flash

  • snowflake-arctic-embed-l-v2.0

  • snowflake-arctic-embed-m-v1.5

  • snowflake-arctic-embed-m

  • snowflake-arctic

  • snowflake-llama-3.1-405b

  • snowflake-llama-3.3-70b

  • voyage-multilingual-2

input_text

Input text to count the tokens in.

Returns

Returns an integer value (INT, INTEGER, BIGINT, SMALLINT, TINYINT, or BYTEINT) that is the number of tokens in the input text, based on the model or function specified.

Usage notes

  • If a function name is specified, the token count is based on the model used by the function.

  • Use lowercase letters in function names.

Note

COUNT_TOKENS does not account for the managed system prompt that is automatically added to the beginning of the input text when you use a Cortex AISQL function. As a result, the value returned by COUNT_TOKENS is lower than the actual number of tokens processed by these functions.
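Because the returned count is a lower bound on what a Cortex AISQL function actually processes, one way to use it is to flag inputs that come close to a token budget while leaving some headroom. The following is a minimal sketch only: the table support_tickets, its columns, the budget of 8000 tokens, and the 500-token headroom are all hypothetical values chosen for illustration, not limits documented for any model.

```sql
-- Hypothetical table, columns, budget (8000), and headroom (500); adjust to your own data and model.
SELECT
    ticket_id,
    SNOWFLAKE.CORTEX.COUNT_TOKENS('llama3.1-70b', ticket_body) AS counted_tokens
FROM support_tickets
-- Flag rows whose counted tokens exceed the budget minus the headroom,
-- since the managed system prompt is not included in the count.
WHERE SNOWFLAKE.CORTEX.COUNT_TOKENS('llama3.1-70b', ticket_body) > 8000 - 500;
```

The size of the managed system prompt is not stated here, so the headroom value is a guess that you would need to tune.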

Examples

The following example returns the token count for the specified prompt using the llama3.1-70b model:

SELECT SNOWFLAKE.CORTEX.COUNT_TOKENS( 'llama3.1-70b', 'what is a large language model?' );
+---+
| 6 |
+---+
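The same call can be applied to a column rather than a literal string, for example to find the longest inputs in a table before sending them to a model. This is a sketch under assumptions: the table product_reviews and the columns review_id and review_text are hypothetical names used only for illustration.

```sql
-- Hypothetical table and column names, used for illustration only.
SELECT
    review_id,
    -- Token count of each row's text, based on the llama3.1-70b model.
    SNOWFLAKE.CORTEX.COUNT_TOKENS('llama3.1-70b', review_text) AS token_count
FROM product_reviews
ORDER BY token_count DESC
LIMIT 10;
```

Because the function returns an integer, the result can also be wrapped in an aggregate such as SUM to estimate the total token volume of a column.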