Export Document AI model builds

You can export Document AI model builds to an internal stage. The export copies the document files to the stage and generates an annotations file. You can then use the exported data for purposes such as creating Snowflake Datasets or extracting information with the AI_EXTRACT function.

Prerequisites

  • To use Document AI, you must have the required privileges. For more information, see Setting up Document AI.

  • To export a Document AI model build, you must have the WRITE privilege on the target stage.

    Note

    The target stage must be an internal stage.

Export a Document AI model build

  1. Sign in to Snowsight.

  2. In the navigation menu, select AI & ML » Document AI.

  3. Select a warehouse.

    The list of existing model builds appears.

  4. Select the (more) menu next to the model build name, and then select Export.

  5. In the Export Build dialog that appears, select a target stage from the list, and then confirm by selecting Export.

  6. When the export process is complete, close the dialog by selecting Close.

    Note

    You can close the dialog before the export process is complete. Closing the dialog doesn’t cancel the export process.

    The model build is exported to the target stage. The target stage directory now contains all documents from the latest version of that Document AI model build, along with the annotations.jsonl file.

The annotations file

When you export a Document AI model build, the annotations.jsonl file is generated in the target stage directory. For each document you export, the file contains the following information:

  • file: Filename identifier

  • prompt: JSON schema that describes the prompts

  • annotatedResponse: User responses in a format consistent with the schema

  • modelResponse: Responses that weren’t modified by the user

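The following example shows a single record from the annotations.jsonl file. In the file, each record occupies one line; the record is formatted across multiple lines here for readability: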

{
  "file": "5d8c22ebe1e9a9b4bc92f611c02a745b_00.pdf",
  "prompt": {
    "type": "object",
    "properties": {
      "information": {
        "description": "Employee information",
        "type": "object",
        "properties": {
          "name": {
            "type": "array",
            "items": {
              "type": "string"
            }
          },
          "address": {
            "type": "array",
            "items": {
              "type": "string"
            }
          },
          "city": {
            "type": "array",
            "items": {
              "type": "string"
            }
          }
        }
      },
      "data": {
        "description": "",
        "type": "object",
        "properties": {
          "ssid": {
            "type": "array",
            "items": {
              "type": "string"
            }
          },
          "employeeid": {
            "type": "array",
            "items": {
              "type": "string"
            }
          },
          "startdate": {
            "type": "array",
            "items": {
              "type": "string"
            }
          },
          "enddate": {
            "type": "array",
            "items": {
              "type": "string"
            }
          }
        }
      },
      "deductions": {
        "description": "",
        "type": "object",
        "properties": {
          "deductions name": {
            "type": "array",
            "items": {
              "type": "string"
            }
          },
          "current": {
            "type": "array",
            "items": {
              "type": "string"
            }
          }
        }
      }
    }
  },
  "annotatedResponse": {
    "information": {
      "name": [
        "John Doe"
      ],
      "address": [
        "Dakota Avenue Powder River, WY 82648"
      ],
      "city": [
        "Powder River, WY 82648"
      ]
    },
    "data": {
      "ssid": [
        "123-45-6789"
      ],
      "employeeid": [
        "34528"
      ],
      "startdate": [
        "06/15/2018"
      ],
      "enddate": [
        "06/30/2018"
      ]
    },
    "deductions": {
      "deductions name": [
        "Federal Tax",
        "Wyoming State Tax",
        "SDI",
        "Soc Sec / OASDI",
        "Health Insurance Tax",
        "None"
      ],
      "current": [
        "82.50",
        "64.08",
        "None",
        "13.32",
        "91.74",
        "21.46"
      ]
    }
  },
  "modelResponse": {}
}
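
To inspect the annotations outside Snowflake, you could read a downloaded copy of the file line by line. The following Python sketch is illustrative only: parse_annotations and flatten_response are hypothetical helper names, the sample record is a shortened version of the example above, and the flattening assumes the two-level group and field nesting shown in that example.

```python
import json


def parse_annotations(lines):
    """Parse JSONL lines into a list of dicts, skipping blank lines."""
    records = []
    for line in lines:
        line = line.strip()
        if line:
            records.append(json.loads(line))
    return records


def flatten_response(response):
    """Map 'group.field' to its list of values.

    Assumes the two-level nesting used in the example record.
    """
    flat = {}
    for group, fields in response.items():
        for field, values in fields.items():
            flat[f"{group}.{field}"] = values
    return flat


# Shortened stand-in for one line of annotations.jsonl (assumed sample data).
sample_line = json.dumps({
    "file": "5d8c22ebe1e9a9b4bc92f611c02a745b_00.pdf",
    "prompt": {
        "type": "object",
        "properties": {
            "information": {
                "description": "Employee information",
                "type": "object",
                "properties": {
                    "name": {"type": "array", "items": {"type": "string"}}
                },
            }
        },
    },
    "annotatedResponse": {
        "information": {"name": ["John Doe"]},
        "data": {"employeeid": ["34528"]},
    },
    "modelResponse": {},
})

records = parse_annotations([sample_line, ""])
flat = flatten_response(records[0]["annotatedResponse"])
print(records[0]["file"])
print(flat)
```

In practice you would pass an open file handle for the downloaded annotations.jsonl to parse_annotations instead of the in-memory sample.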

Work with the exported data

After you export a Document AI model build, you can create a table with the exported data for further processing:

  1. Create a file format for the annotations file:

    CREATE OR REPLACE FILE FORMAT my_json
      TYPE = 'JSON';
    
  2. Create a table:

    CREATE OR REPLACE TABLE exported_data_table AS (
       SELECT
          input_file.$1:file::STRING AS file,
          input_file.$1:prompt AS prompt,
          input_file.$1:annotatedResponse AS response
       FROM '@docai_db.docai_schema.docai_stage/docai_test_2025_10_03_16_00_10/annotations.jsonl' (FILE_FORMAT => my_json) input_file
       WHERE response != '{}'
    );
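
    The selection logic of the query above can also be sketched in Python for a locally downloaded copy of the annotations file. This is a minimal sketch under assumptions, not a replacement for the SQL: to_rows is a hypothetical helper, the two sample records are invented for illustration, and the column names file, prompt, and response simply mirror the table definition above.

```python
import json


def to_rows(jsonl_text):
    """Keep file, prompt, and annotatedResponse (as 'response') per record.

    Records with an empty annotated response are skipped, mirroring the
    WHERE clause in the SQL statement.
    """
    rows = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        response = record.get("annotatedResponse") or {}
        if not response:
            continue
        rows.append({
            "file": record["file"],
            "prompt": record["prompt"],
            "response": response,
        })
    return rows


# Two invented records: one annotated, one without annotations.
schema = {"type": "object", "properties": {}}
sample_text = "\n".join([
    json.dumps({"file": "a.pdf", "prompt": schema,
                "annotatedResponse": {"data": {"employeeid": ["34528"]}},
                "modelResponse": {}}),
    json.dumps({"file": "b.pdf", "prompt": schema,
                "annotatedResponse": {}, "modelResponse": {}}),
])

rows = to_rows(sample_text)
print([row["file"] for row in rows])
```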
    

You can now either convert the exported data to a Dataset for further use in Snowflake, or run the AI_EXTRACT function using that data:

  • Create a Dataset for the exported data:

    CREATE DATASET my_dataset;
    
    ALTER DATASET my_dataset
    ADD VERSION 'v2' FROM (
      SELECT
        CONCAT('@docai_db.docai_schema.docai_stage/docai_test_2025_10_03_16_00_10/', file) AS file,
        prompt,
        response
      FROM exported_data_table
    );
    

    For more information about Datasets, see Snowflake Datasets.

  • Run AI_EXTRACT using the exported data:

    SELECT
    AI_EXTRACT (
      file => TO_FILE('@docai_db.docai_schema.docai_stage/docai_test_2025_10_03_16_00_10', my_table.file),
      responseFormat => PARSE_JSON('{ "schema": ' || TO_VARIANT(my_table.prompt) || '}')
      )
    FROM docai_db.docai_schema.exported_data_table AS my_table;
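
    The responseFormat argument in the query above wraps each record's prompt schema in an object with a single schema key. The following Python sketch shows the same wrapping for one record, which can help when checking what the concatenated JSON string should look like. build_response_format is a hypothetical helper and not part of any Snowflake API, and the schema is a shortened, assumed sample.

```python
import json


def build_response_format(prompt_schema):
    """Return a JSON string of the form {"schema": <prompt schema>}."""
    return json.dumps({"schema": prompt_schema})


# Shortened prompt schema, assumed for illustration.
prompt_schema = {
    "type": "object",
    "properties": {
        "data": {
            "description": "",
            "type": "object",
            "properties": {
                "employeeid": {"type": "array", "items": {"type": "string"}}
            },
        }
    },
}

response_format = build_response_format(prompt_schema)
print(response_format)
```

    Building the object with a JSON serializer rather than string concatenation avoids quoting mistakes when the schema contains special characters.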
    