Categories:

String & binary functions (AI Functions)

AI_EXTRACT (Document AI legacy models)

Extracts information from a file using a legacy Document AI model.

Syntax

AI_EXTRACT ( model => <model> ,
            file => <file> )

Arguments

model => model

Specifies the Document AI Arctic-TILT model for extraction stored in the Snowflake Model Registry; for example, my_db.my_schema.my_model.

file => file

A FILE for extraction.

Returns

Entity extraction

{
  "error": null,
  "response": {
    "invoice_items": [
      "NEW CRUSHED VELVET DIVAN BED",
      "Vintage Radiator",
      "Solid Wooden Worktop",
      "Sienna Crushed Velvet Curtains"
    ],
    "invoice_number": "123/20",
    "tax_amount": "77.57",
    "total_amount": "465.43 GBP",
    "vendor_name": "UK Exports & Imports Ltd"
  }
}
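
The fields under response can be read with Snowflake path notation. The following is a minimal sketch, assuming the result can be traversed like other semi-structured values; the file name invoice.pdf and the field names are placeholders that mirror the sample output above and depend on how the model was defined:

```sql
-- Sketch only: model, stage, file, and field names are placeholders.
WITH extracted AS (
  SELECT AI_EXTRACT(
    model => 'my_db.my_schema.my_model',
    file => TO_FILE('@files_db.files_schema.files', 'invoice.pdf')
  ) AS result
)
SELECT
  result:error                            AS error,           -- null in the sample output above
  result:response:invoice_number::STRING  AS invoice_number,
  result:response:vendor_name::STRING     AS vendor_name,
  result:response:total_amount::STRING    AS total_amount,    -- returned as text, e.g. with a currency code
  result:response:invoice_items           AS invoice_items    -- array of strings
FROM extracted;
```

Values are cast to STRING because the sample output returns every extracted value as text.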

Table extraction

{
  "error": null,
  "response": {
    "table1": {
      "gross": ["10", "31", "10"],
      "item": ["apples", "banana", "pear"],
      "net": ["9", "30", "10"],
      "tax": ["1", "1", ""]
    },
    "table2": {
      "name": ["John", "Ana", "Lisa"],
      "surname": ["Smith", "Nixon", "Gonzales"]
    }
  }
}
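
Table results are returned as one array per column, so turning them into rows requires pairing the arrays by position. The following is a minimal sketch, assuming the column arrays of a table are aligned by index as in the sample above; table1 and its column names are taken from that sample, and the file name is a placeholder:

```sql
-- Sketch only: model, stage, file, table, and column names are placeholders.
WITH extracted AS (
  SELECT AI_EXTRACT(
    model => 'my_db.my_schema.my_model',
    file => TO_FILE('@files_db.files_schema.files', 'invoice.pdf')
  ) AS result
)
SELECT
  i.index                                                   AS row_index,
  i.value::STRING                                           AS item,
  GET(e.result:response:table1:net,   i.index)::STRING      AS net,
  GET(e.result:response:table1:tax,   i.index)::STRING      AS tax,
  GET(e.result:response:table1:gross, i.index)::STRING      AS gross
FROM extracted e,
  -- One output row per element of the "item" column array.
  LATERAL FLATTEN(input => e.result:response:table1:item) i
ORDER BY row_index;
```

Missing cells appear as empty strings in the sample output (for example, the third tax value), so downstream numeric casts may need to handle them.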

Access control requirements

Users must use a role that has been granted the SNOWFLAKE.CORTEX_USER database role. For information about granting this privilege, see Cortex LLM privileges.

In addition, you must have the OWNERSHIP privilege on the model.
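
As a minimal sketch of the first requirement, an administrator could grant the database role as follows. The role name doc_ai_role is hypothetical, and the sketch assumes the statement is run by a role that is allowed to grant SNOWFLAKE database roles, such as ACCOUNTADMIN:

```sql
-- Sketch only: doc_ai_role is a hypothetical role name.
USE ROLE ACCOUNTADMIN;

-- Give the role access to Cortex AI functions, including AI_EXTRACT.
GRANT DATABASE ROLE SNOWFLAKE.CORTEX_USER TO ROLE doc_ai_role;
```

The OWNERSHIP requirement on the model is separate and is not covered by this grant.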

Usage notes

  • The model must be in the Snowflake Model Registry.

  • The Document AI model should not have more than 100 entities.

  • If not set explicitly, the latest available model version is used by default (the version set when the model was published or trained in the Document AI UI). To set the default version of a model, use the ALTER MODEL command as shown in the following example:

    ALTER MODEL my_model SET DEFAULT_VERSION = new_version;

  • Confidence scores are not supported.

  • AI_EXTRACT uses token-based billing. For more information on the AI_EXTRACT cost for Document AI legacy models, see the Snowflake Service Consumption Table.

    • Entity extraction cost is labeled as arctic-tilt-entity.
    • Table extraction cost is labeled as arctic-tilt-table.

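Regional availability

Available in the following regions:

  • AWS Canada (Central)
  • AWS EU (Frankfurt)
  • AWS EU (Ireland)
  • AWS US East (N. Virginia)
  • AWS US East (Ohio)
  • AWS US West (Oregon)
  • Azure Australia East (New South Wales)
  • Azure East US 2 (Virginia)
  • Azure Southeast Asia (Singapore)
  • Azure West Europe (Netherlands)
  • Azure West US 2 (Washington)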

If your region is not listed, use cross-region inference.
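
Cross-region inference is controlled by an account-level parameter. The following is a minimal sketch, assuming you have the ACCOUNTADMIN role and that allowing any region is acceptable under your data residency requirements; a narrower value can be used instead of 'ANY_REGION':

```sql
-- Sketch only: choose a value that matches your data residency requirements.
USE ROLE ACCOUNTADMIN;

-- Allow inference requests to be processed in another region.
ALTER ACCOUNT SET CORTEX_ENABLED_CROSS_REGION = 'ANY_REGION';

-- Check the current setting.
SHOW PARAMETERS LIKE 'CORTEX_ENABLED_CROSS_REGION' IN ACCOUNT;
```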

Examples

The following example extracts the features defined in the Document AI model:

SELECT AI_EXTRACT(
  model => 'my_db.my_schema.my_model',
  file => TO_FILE('@files_db.files_schema.files', 'agreement.pdf')
);

The following example extracts information from all files in a directory on a stage:

SELECT AI_EXTRACT(
  model => 'my_db.my_schema.my_model',
  file => TO_FILE('@db.schema.files', relative_path)
) FROM DIRECTORY (@db.schema.files);
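
When processing many files, it can help to keep each file's path next to its result so that rows can be traced back to their source. The following is a minimal sketch of that variation; the filter on the .pdf extension is an assumption added for illustration and is not required by AI_EXTRACT:

```sql
-- Sketch only: model and stage names are placeholders.
SELECT
  relative_path,                 -- identifies the source file for each result
  AI_EXTRACT(
    model => 'my_db.my_schema.my_model',
    file => TO_FILE('@db.schema.files', relative_path)
  ) AS result
FROM DIRECTORY (@db.schema.files)
WHERE relative_path ILIKE '%.pdf';  -- illustrative filter: process PDF files only
```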

Legal notices

Refer to Snowflake AI and ML.