Apache Iceberg™ tables
Apache Iceberg™ tables for Snowflake combine the performance and query semantics of typical Snowflake tables with external cloud storage that you manage. They are ideal for existing data lakes that you cannot, or choose not to, store in Snowflake.
Iceberg tables use the Apache Iceberg™ (https://iceberg.apache.org/) open table format specification, which provides an abstraction layer on data files stored in open formats and supports features such as:
- ACID (atomicity, consistency, isolation, durability) transactions
- Schema evolution
- Hidden partitioning
- Table snapshots
Snowflake supports Iceberg tables that use the Apache Parquet™ (https://parquet.apache.org/) file format.
Getting started
To get started with Iceberg tables, see Tutorial: Create your first Apache Iceberg™ table.
How it works
This section provides information specific to working with Iceberg tables in Snowflake. To learn more about the Iceberg table format specification, see the official Apache Iceberg documentation (https://iceberg.apache.org/docs/latest/) and the Iceberg Table Spec (https://iceberg.apache.org/spec/).
Data storage
Iceberg tables store their data and metadata files in an external cloud storage location (Amazon S3, Google Cloud Storage, or Azure Storage). The external storage is not part of Snowflake. You are responsible for all management of the external cloud storage location, including the configuration of data protection and recovery. Snowflake does not provide Fail-safe storage for Iceberg tables.
Snowflake connects to your storage location using an external volume, and Iceberg tables incur no Snowflake storage costs. For more information, see Billing.
To learn more about storage for Iceberg tables, see Storage for Apache Iceberg™ tables.
External volume
An external volume is a named, account-level Snowflake object that you use to connect Snowflake to your external cloud storage for Iceberg tables. An external volume stores an identity and access management (IAM) entity for your storage location. Snowflake uses the IAM entity to securely connect to your storage for accessing table data, Iceberg metadata, and manifest files that store the table schema, partitions, and other metadata.
A single external volume can support one or more Iceberg tables.
To set up an external volume for Iceberg tables, see Configure an external volume.
Catalog
An Iceberg catalog enables a compute engine to manage and load Iceberg tables. The catalog forms the first architectural layer in the Iceberg table specification (https://iceberg.apache.org/spec/#overview) and must support:
- Storing the current metadata pointer for one or more Iceberg tables. A metadata pointer maps a table name to the location of that table’s current metadata file.
- Performing atomic operations so that you can update the current metadata pointer for a table.
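The two catalog requirements above can be illustrated with a minimal in-memory sketch. This is not how Snowflake or any real Iceberg catalog is implemented; the class name `InMemoryCatalog`, the table name, and the metadata file paths are all hypothetical. It only shows a name-to-pointer mapping with a compare-and-swap style update.

```python
import threading


class InMemoryCatalog:
    """Toy catalog: maps table names to their current metadata file location."""

    def __init__(self):
        self._pointers = {}
        self._lock = threading.Lock()

    def current_metadata(self, table_name):
        """Return the current metadata pointer for a table, or None if unknown."""
        with self._lock:
            return self._pointers.get(table_name)

    def commit(self, table_name, expected, new_location):
        """Atomically swap the pointer only if it still equals `expected`.

        Returns True when the swap happens and False when another writer
        committed first, in which case the caller should reload and retry.
        """
        with self._lock:
            if self._pointers.get(table_name) != expected:
                return False
            self._pointers[table_name] = new_location
            return True


catalog = InMemoryCatalog()
# Hypothetical table name and metadata file paths, for illustration only.
v1 = "s3://bucket/sales/metadata/v1.metadata.json"
v2 = "s3://bucket/sales/metadata/v2.metadata.json"
created = catalog.commit("db.sales", None, v1)  # first commit succeeds
stale = catalog.commit("db.sales", None, v2)    # stale expectation is rejected
updated = catalog.commit("db.sales", v1, v2)    # correct expectation succeeds
```

The lock makes the check and the update a single step, so two writers that start from the same metadata file cannot both succeed.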
To learn more about Iceberg catalogs, see the Apache Iceberg documentation (https://iceberg.apache.org/terms/#catalog-implementations).
Snowflake supports different catalog options. For example, you can use Snowflake as the Iceberg catalog, or use a catalog integration to connect Snowflake to an external Iceberg catalog.
Catalog integration
A catalog integration is a named, account-level Snowflake object that stores information about how your table metadata is organized for the following scenarios:
- When you don’t use Snowflake as the Iceberg catalog. For example, you need a catalog integration if your table is managed by AWS Glue.
- When you want to integrate with Snowflake Open Catalog to:
  - Query an Iceberg table in Snowflake Open Catalog using Snowflake.
  - Sync a Snowflake-managed Iceberg table with Snowflake Open Catalog so that third-party compute engines can query the table.
A single catalog integration can support one or more Iceberg tables that use the same external catalog.
To set up a catalog integration, see Configure a catalog integration.
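Metadata and snapshots
Iceberg uses a snapshot-based querying model, in which data files are mapped using manifest files and metadata files. A snapshot represents the state of a table at a point in time and is used to access the complete set of data files in the table.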
To learn about table metadata and Time Travel support, see Metadata and retention for Apache Iceberg™ tables.
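As a rough illustration of the snapshot model, the following sketch resolves a snapshot to its full set of data files by walking its manifests. The `Manifest` and `Snapshot` classes and the file names are hypothetical simplifications; real Iceberg metadata carries many more fields.

```python
from dataclasses import dataclass, field


@dataclass
class Manifest:
    """A manifest file lists the data files it tracks."""
    data_files: list = field(default_factory=list)


@dataclass
class Snapshot:
    """A snapshot references the manifests that describe the table at one point in time."""
    snapshot_id: int
    manifests: list = field(default_factory=list)


def data_files_for(snapshot):
    """Return every data file reachable from the snapshot, in manifest order."""
    files = []
    for manifest in snapshot.manifests:
        files.extend(manifest.data_files)
    return files


# Hypothetical file names. The second snapshot reuses the first manifest
# and adds one more, so the earlier snapshot stays readable unchanged.
m1 = Manifest(["data/a.parquet", "data/b.parquet"])
m2 = Manifest(["data/c.parquet"])
snap1 = Snapshot(1, [m1])
snap2 = Snapshot(2, [m1, m2])
```

Reading `snap1` after `snap2` exists still yields the original two files, which is the property that point-in-time queries rely on.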
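Cross-cloud/cross-region support
Snowflake supports using an external volume storage location with a cloud provider (in a different region) that differs from the cloud provider that hosts your Snowflake account.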
| Table type | Cross-cloud/cross-region support | Notes |
|---|---|---|
| Tables that use an external catalog with a catalog integration | ✔ | If your Snowflake account and external volume are in different regions, your external cloud storage account incurs egress costs when you query the table. |
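| Tables that use Snowflake as the catalog | ✔ | If your Snowflake account and external volume are in different regions, your external cloud storage account incurs egress costs when you query the table. These tables incur costs for cross-region data transfer usage. For more information, see Billing. |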
Billing
Snowflake bills your account for virtual warehouse (compute) usage and cloud services when you work with Iceberg tables. Snowflake also bills your account if you use automated refresh or an external query engine through Snowflake Horizon Catalog.
If a Snowflake-managed Iceberg table is cross-cloud/cross-region, Snowflake bills your cross-region data transfer usage under the TRANSFER_TYPE of DATA_LAKE. To learn more, see:
- DATA_TRANSFER_HISTORY view in the ORGANIZATION_USAGE schema.
- DATA_TRANSFER_HISTORY view in the ACCOUNT_USAGE schema.
Snowflake does not bill your account for the following:
- Iceberg table storage costs when the table uses an external volume that you manage. Your cloud storage provider bills you directly for data storage usage. However, if the table uses Snowflake Storage (EXTERNAL_VOLUME = SNOWFLAKE_MANAGED), Snowflake charges for the storage. For more information, see Snowflake storage for Apache Iceberg™ tables.
- Active bytes used by Iceberg tables. However, the INFORMATION_SCHEMA.TABLE_STORAGE_METRICS and ACCOUNT_USAGE.TABLE_STORAGE_METRICS views display ACTIVE_BYTES for Iceberg tables to help you track how much storage a table occupies. To view an example, see Retrieve storage metrics.
Note
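If your Snowflake account and external volume are in different regions, your external cloud storage account incurs egress costs when you query the table.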
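Catalog options
Snowflake supports the following Iceberg catalog options:
- Use Snowflake as the Iceberg catalog
- Use an external Iceberg catalog
The following table summarizes the differences between these two catalog options.
| | Use Snowflake as the catalog | Use an external catalog |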
|---|---|---|
| Read access | ✔ | ✔ |
| Write access | ✔ | ✔ |
| Catalog-vended credentials | ✔ | |
| Write access across regions | ✔ | ✔ with Write support for externally managed tables |
| Data and metadata storage | External volume (cloud storage) | External volume (cloud storage) |
| Snowflake platform support | ✔ | |
| Integrates with Snowflake Open Catalog | ✔ You can sync a Snowflake-managed table with Open Catalog to query a table using other compute engines. | ✔ You can use Snowflake to query or write to Iceberg tables managed by Open Catalog. |
| Works with the Snowflake Catalog SDK | ✔ | ✔ |
| Replication for tables | ✔ See Configure replication for Snowflake-managed Apache Iceberg™ tables. | |
Use Snowflake as the catalog
An Iceberg table that uses Snowflake as the Iceberg catalog (Snowflake-managed Iceberg table) provides full Snowflake platform support with read and write access. The table data and metadata are stored in external cloud storage, which Snowflake accesses using an external volume. Snowflake handles all life-cycle maintenance, such as compaction, for the table. However, you can disable compaction for the table, if needed.
Use an external catalog
An Iceberg table that uses an external catalog provides limited Snowflake platform support.
With this table type, Snowflake uses a catalog integration to retrieve information about your Iceberg metadata and schema.
You can use this option to create an Iceberg table for the following sources:
- Remote Iceberg REST catalog, including AWS Glue and Snowflake Open Catalog. Snowflake supports writes to externally managed tables that use a remote Iceberg REST catalog.
  Tip
  To bring your external data from a remote Iceberg REST catalog into Snowflake, you can create a catalog-linked database. The database automatically discovers and stays in sync with the namespaces and tables in your remote catalog. You can use a catalog-linked database to read and write to the tables in your remote catalog from Snowflake, while preserving full interoperability with your existing Iceberg ecosystem. For more information, see the following topics:
  - Use a catalog-linked database for Apache Iceberg™ tables
  - If your external data is in Unity Catalog, see Tutorial: Set up bidirectional access to Apache Iceberg™ tables in Databricks Unity Catalog
  - If your external data is in AWS Glue, see Build Data Lakes using Apache Iceberg with Snowflake and AWS Glue
- Delta table files in object storage (Delta Direct; see CREATE ICEBERG TABLE (Delta files in object storage))
Snowflake doesn’t assume any life-cycle management of the table.
The table data and metadata are stored in external cloud storage, which Snowflake accesses using an external volume.
Note
If you want full Snowflake platform support for an Iceberg table that uses an external catalog, you can convert it to use Snowflake as the catalog. For more information, see Convert an Apache Iceberg™ table to use Snowflake as the catalog.
The following diagram shows how an Iceberg table uses a catalog integration with an external Iceberg catalog.
Apache Iceberg™ v3 support
Snowflake supports v3 of the Apache Iceberg™ table specification. For details, see Apache Iceberg™ tables: Support for Apache Iceberg™ v3.
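Considerations and limitations
The following considerations and limitations apply to Iceberg tables, and are subject to change:
Clouds and regions
- Iceberg tables are available for all Snowflake accounts, on all cloud platforms and in all regions.
- Cross-cloud/cross-region tables are supported. For more information, see Cross-cloud/cross-region support.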
Iceberg
- Versions 1 and 2 of the Apache Iceberg specification are supported, excluding the following features (https://iceberg.apache.org/spec/):
  - Row-level equality deletes. However, tables that use Snowflake as the catalog support Snowflake DELETE statements.
  - Using the history.expire.min-snapshots-to-keep table property (https://iceberg.apache.org/docs/1.2.1/configuration/#table-behavior-properties) to specify the default minimum number of snapshots to keep. For more information, see Metadata and snapshots.
- Iceberg partitioning with the bucket transform function impacts performance for queries that use conditional clauses to filter results.
- For Iceberg tables that aren’t managed by Snowflake, be aware of the following:
  - Time travel to any snapshot generated after table creation is supported as long as you periodically refresh the table before the snapshot expires.
  - Converting a table that has an un-materialized identity partition column isn’t supported. An un-materialized identity partition column is created when a table defines an identity transform using a source column that doesn’t exist in a Parquet file.
- For row-level deletes:
  - Snowflake supports position deletes (https://iceberg.apache.org/spec/#position-delete-files) only for v2 Iceberg tables, and deletion vectors (https://iceberg.apache.org/spec/#deletion-vectors) for v3 Iceberg tables.
  - Snowflake only supports position deletes with externally managed Iceberg tables.
  - For the best read performance when you use row-level deletes, perform regular compaction and table maintenance to remove old delete files. For information, see Maintain tables that use an external catalog.
  - Excessive position deletes, especially dangling position deletes, might prevent table creation and refresh operations. To avoid this issue, perform table maintenance to remove extra position deletes.
    The table maintenance method to use depends on your external Iceberg engine. For example, you can use the rewrite_data_files method for Spark with the delete-file-threshold or rewrite-all options. For more information, see rewrite_data_files (https://iceberg.apache.org/docs/latest/spark-procedures/#rewrite_data_files) in the Apache Iceberg™ documentation.
File formats
- Iceberg tables support Apache Parquet files.
- Parquet files that use the unsigned integer logical type aren’t supported.
- For Parquet files that use the LIST logical type, be aware of the following:
  - The three-level annotation structure with the element keyword is supported. For more information, see Parquet Logical Type Definitions (https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#lists). If your Parquet file uses an obsolete format with the array keyword, you must regenerate your data based on the supported format.
External volumes
- You can’t use a storage integration to access the cloud storage locations in an external volume.
- You must configure a separate trust relationship for each external volume that you create.
- You can use outbound private connectivity to access Snowflake-managed Iceberg tables and Iceberg tables that use a catalog integration for object storage, but cannot use it to access Iceberg tables that use other catalog integrations.
- After you create a Snowflake-managed table, the path to its files in external storage does not change, even if you rename the table.
- Snowflake can’t support external volumes with S3 bucket names that contain dots (for example, my.s3.bucket). S3 doesn’t support SSL for virtual-hosted-style buckets with dots in the name, and Snowflake uses virtual-hosted-style paths and HTTPS to access data in S3.
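Before you configure an external volume, a quick local check for the dotted bucket name restriction above might look like the following sketch. The helper name `bucket_has_dots` is hypothetical, and the check assumes a storage location written as an s3:// URL; it isn't part of any Snowflake tooling.

```python
def bucket_has_dots(storage_url):
    """Return True if the bucket in an s3:// URL contains a dot."""
    prefix = "s3://"
    if not storage_url.startswith(prefix):
        raise ValueError("expected a URL that starts with s3://")
    # The bucket name is everything between the scheme and the first slash.
    bucket = storage_url[len(prefix):].split("/", 1)[0]
    return "." in bucket
```

Only the bucket name matters here; dots in the object path after the bucket don’t trigger the restriction.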
Metadata files
The metadata files don’t identify the most recent snapshot of an Iceberg table.
You can’t modify the location of the data files or snapshot using the ALTER ICEBERG TABLE command. To modify either of these settings, you must recreate the table (using the CREATE OR REPLACE ICEBERG TABLE syntax).
For tables that use an external catalog:
- Ensure that manifest files don’t contain duplicates. If duplicate files are present in the same snapshot, Snowflake returns an error that includes the path of the duplicate file.
- You can’t create a table if the Parquet metadata contains invalid UTF-8 characters. Ensure that your Parquet metadata is UTF-8 compliant.
Snowflake detects corruptions and inconsistencies in Parquet metadata produced outside of Snowflake, and surfaces issues through error messages.
It’s possible to create, refresh, or query externally managed (or converted) tables, even if the table metadata is inconsistent. When writing Iceberg data, ensure that the table’s metadata statistics (for example, RowCount or NullCount) match the data content.
For tables that use Snowflake as the catalog, Snowflake processes DDL statements individually and produces metadata in a way that might differ from other catalogs. For more information, see DDL statements.
Clustering
Clustering support depends on the type of Iceberg table.
| Table type | Notes |
|---|---|
| Tables that use Snowflake as the Iceberg catalog | Set a clustering key by using either the CREATE ICEBERG TABLE or the ALTER ICEBERG TABLE command. To set or manage a clustering key, see CREATE ICEBERG TABLE (Snowflake as the Iceberg catalog) and ALTER ICEBERG TABLE. |
| Tables that use an external catalog | Clustering is not supported. |
| Converted tables | Snowflake only clusters files if they were created after converting the table, or if the files have since been modified using a DML statement. |
Delta
- Snowflake supports minReaderVersion 3 and can read all tables written by engines that use the latest version of Delta Lake, which is 4.0.0. Delta Lake version 4.0.0 includes support for deletion vectors and liquid clustering.
- Snowflake streams aren’t supported for Iceberg tables created from Delta table files with partition columns. However, insert-only streams for tables created from Delta files without partition columns are supported.
- Iceberg tables created from Delta files that were created before the 2024_04 release bundle are not supported in dynamic tables.
- Snowflake doesn’t support creating Iceberg tables from Delta table definitions in the AWS Glue Data Catalog.
Parquet files (data files for Delta tables) that use any of the following features or data types aren’t supported:
- Field IDs.
- The INTERVAL data type.
- The DECIMAL data type with precision higher than 38.
- LIST or MAP types with one-level or two-level representation.
- Unsigned integer types (INT(signed = false)).
- The FLOAT16 data type.
You can use the Parquet physical type int96 for TIMESTAMP, but Snowflake doesn’t support int96 for TIMESTAMP_NTZ.
- For more information about Delta data types and Iceberg tables, see Delta data types.
Snowflake processes a maximum of 1000 Delta commit files each time you refresh a table using CREATE/ALTER … REFRESH. If your table has over 1000 commit files, you can do additional manual refreshes. Each time, the refresh process continues from where the last one stopped.
Note
Snowflake uses Delta checkpoint files when creating an Iceberg table. The 1,000 commit file limit only applies to commits after the latest checkpoint.
When you refresh an existing table, Snowflake processes Delta commit files, but not checkpoint files. If table maintenance removes stale log and data files for the source Delta table, you should refresh Delta-based Iceberg tables in Snowflake more frequently than the retention period of Delta logs and data files.
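To estimate how many manual refreshes a backlog of Delta commit files needs, the arithmetic can be sketched as follows. The function name `refreshes_needed` is hypothetical, and the sketch assumes that no new commits arrive between refreshes and that each refresh resumes where the previous one stopped, as described above.

```python
import math

MAX_COMMITS_PER_REFRESH = 1000  # per-refresh limit stated in the text above


def refreshes_needed(pending_commit_files):
    """Return how many refreshes it takes to process the pending commit files."""
    if pending_commit_files < 0:
        raise ValueError("pending_commit_files must be non-negative")
    # Each refresh handles at most MAX_COMMITS_PER_REFRESH files, so round up.
    return math.ceil(pending_commit_files / MAX_COMMITS_PER_REFRESH)
```

For example, a backlog of 2,500 commit files after the latest checkpoint would take three refreshes under these assumptions.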
- The following Delta Lake features aren’t currently supported: Row Tracking, change data files, change metadata, DataChange, CDC, protocol evolution.
Automated refresh
- For catalog integrations created before Snowflake version 8.22 (or 9.2 for Delta-based tables), you must manually set the REFRESH_INTERVAL_SECONDS parameter before you enable automated refresh on tables that depend on that catalog integration. For instructions, see ALTER CATALOG INTEGRATION … SET AUTO_REFRESH.
- For catalog integrations for object storage, automated refresh is only supported for integrations with TABLE_FORMAT = DELTA.
- For tables with frequent updates, using a shorter polling interval (REFRESH_INTERVAL_SECONDS) can cause performance degradation.
- Automated refresh synchronizes schema changes alongside DML operations such as INSERT, UPDATE, or DELETE. To apply schema changes made through DDL operations alone, perform a manual refresh.
Catalog-linked databases and automatic table discovery
Supported only when you use a catalog integration for Iceberg REST (for example, Snowflake Open Catalog).
To limit automatic table discovery to a specific set of namespaces, use the ALLOWED_NAMESPACES parameter. You can also use the BLOCKED_NAMESPACES parameter to block a set of namespaces.
Snowflake doesn’t sync remote catalog access control for users or roles.
You can create schemas, externally managed Iceberg tables, or database roles in a catalog-linked database. Creating other Snowflake objects isn’t currently supported.
When you create a catalog-linked database, you can’t specify the default Iceberg version or merge-on-read behavior to use for Iceberg tables.
However, you can modify these properties for an existing database by using the ALTER DATABASE (catalog-linked) command to set the following parameters:
- ICEBERG_VERSION_DEFAULT
- ENABLE_ICEBERG_MERGE_ON_READ
For Iceberg tables in a catalog-linked database:
Snowflake bidirectionally syncs table and column descriptions between the remote catalog and Snowflake. Sync can update a description to a new value, but never replaces a non-empty description with an empty one. Other remote catalog table properties, such as retention policies or buffers, aren’t copied, and altering table properties isn’t currently supported.
Automated refresh is enabled by default. If the table-uuid of an external table and the catalog-linked database table don’t match, refresh fails and Snowflake drops the table from the catalog-linked database; Snowflake doesn’t change the remote table.
If you drop a table from the remote catalog, Snowflake drops the table from the catalog-linked database. This action is asynchronous, so you might not see the change in the remote catalog right away.
If you rename a table in the remote catalog, Snowflake drops the existing table from the catalog-linked database and creates a table with the new name.
Masking policies and tags are supported. Other Snowflake-specific features, including replication and cloning, aren’t supported.
The character that you choose for the NAMESPACE_FLATTEN_DELIMITER parameter can’t appear in your remote namespaces. During the auto discovery process, Snowflake skips any namespace that contains the delimiter, and doesn’t create a corresponding schema in your catalog-linked database.
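The skip rule for namespaces that contain the delimiter can be sketched as below. This only mirrors the behavior described in the preceding paragraph and isn’t Snowflake’s discovery code; the function name, the representation of a namespace as a list of levels, and the sample namespaces are all assumptions made for illustration.

```python
def discoverable_namespaces(namespaces, delimiter):
    """Split remote namespaces into those kept and those skipped.

    Each namespace is a list of levels, for example ["sales", "emea"].
    A namespace is skipped if any of its levels contains the delimiter.
    """
    kept, skipped = [], []
    for levels in namespaces:
        if any(delimiter in level for level in levels):
            skipped.append(levels)
        else:
            kept.append(levels)
    return kept, skipped


# Hypothetical namespaces; "_" stands in for NAMESPACE_FLATTEN_DELIMITER.
kept, skipped = discoverable_namespaces(
    [["sales", "emea"], ["raw_data", "events"], ["finance"]], "_"
)
```

Running a check like this against your remote namespaces before you pick a delimiter shows which schemas would be missing from the catalog-linked database.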
If you specify anything other than _, $, or numbers for the NAMESPACE_FLATTEN_DELIMITER parameter, you must put the schema name in quotes when you query the table.
For databases linked to AWS Glue, you must use lowercase letters and surround the schema, table, and column names in double quotes. This is also required for other Iceberg REST catalogs that only support lowercase identifiers.
A valid query references the schema, table, and column names in lowercase letters inside double quotes. Statements that use uppercase letters or omit the double quotes aren’t valid.
Using UNDROP ICEBERG TABLE isn’t supported.
Sharing:
- Sharing with a listing isn’t currently supported
- Direct sharing is supported
For writing to tables in a catalog-linked database:
- Creating tables in nested namespaces isn’t currently supported.
- Writing to tables in nested namespaces isn’t currently supported.
- Position row-level deletes (https://iceberg.apache.org/spec/#row-level-deletes) are supported for tables stored on Amazon S3, Azure, or Google Cloud. Row-level deletes with equality delete files aren’t supported. For more information about row-level deletes, see Use row-level deletes. To turn off position deletes, so that Data Manipulation Language (DML) operations run in copy-on-write mode, set the ENABLE_ICEBERG_MERGE_ON_READ parameter to FALSE at the table, schema, or database level.
Externally managed write support
Snowflake supports externally managed writes for Iceberg tables that use version 2 of the Iceberg table specification (https://iceberg.apache.org/spec/).
Snowflake provides Data Definition Language (DDL) and Data Manipulation Language (DML) commands for externally managed tables. However, you configure metadata and data retention using your external catalog and the tools provided by your external storage provider. For more information, see Tables that use an external catalog.
For writes, Snowflake ensures that changes are committed to your remote catalog before updating the table in Snowflake.
If you use a catalog-linked database, you can use the CREATE ICEBERG TABLE syntax with column definitions to create a table in Snowflake and in your remote catalog. If you use a standard Snowflake database (not linked to a catalog), you must first create a table in your remote catalog. After that, you can use the CREATE ICEBERG TABLE (Iceberg REST catalog) syntax to create an Iceberg table in Snowflake and write to it.
For the AWS Glue Data Catalog: Dropping an externally managed table through Snowflake doesn’t delete the underlying table files. This behavior is specific to the AWS Glue Data Catalog implementation.
You can’t drop an Amazon S3 Table through Snowflake. The Amazon S3 Tables service requires the purge option to be specified with the DROP command, which Snowflake doesn’t currently support.
Position row-level deletes (https://iceberg.apache.org/spec/#row-level-deletes) are supported for tables stored on Amazon S3, Azure, or Google Cloud. Row-level deletes with equality delete files aren’t supported. For more information about row-level deletes, see Use row-level deletes. To turn off position deletes, so that DML operations run in copy-on-write mode, set the ENABLE_ICEBERG_MERGE_ON_READ parameter to FALSE at the table, schema, or database level.
Writing to externally managed tables with the following Iceberg data types isn’t supported:
- uuid
- fixed(L)
The following features aren’t currently supported when you use Snowflake to write to externally managed Iceberg tables:
Server-side encryption (SSE) for Azure external volumes.
Multi-statement transactions. Snowflake supports autocommit transactions only.
Conversion to Snowflake-managed tables.
External Iceberg catalogs that don’t conform to the Iceberg REST protocol.
Using the OR REPLACE option when creating a table.
Using the CREATE ICEBERG TABLE (catalog-linked database) … AS SELECT syntax if you use one of the following catalogs as your remote catalog:
- AWS Glue
- Databricks Unity Catalog
Alternatively, you can use the CREATE ICEBERG TABLE (Iceberg REST catalog) syntax to create an empty Iceberg table and then use an INSERT INTO … SELECT statement to insert data into the empty table. However, this alternative uses two separate transactions, so it doesn’t guarantee atomicity.
For creating schemas in a catalog-linked database, be aware of the following:
- The CREATE SCHEMA command creates a corresponding namespace in your remote catalog only when you use a catalog-linked database.
- The ALTER and CLONE options aren’t supported.
- Delimiters aren’t supported for schema names. Only alphanumeric schema names are supported.
You can set a target file size for a table’s Parquet files. For more information, see Set a target file size.
For Azure cloud storage services: Snowflake only supports externally managed writes for Iceberg tables that use the following services for external storage:
- Blob Storage
- Data Lake Storage Gen2
  Connecting Snowflake to Data Lake Storage Gen2 storage by using an external volume is in public preview. This configuration enables externally managed writes to catalogs that are only configured to use Data Lake Storage, such as Unity Catalog. For more information, see Configure an external volume for Azure.
  Note
  Connecting Snowflake to Data Lake Storage Gen2 storage by using catalog-vended credentials isn’t supported.
- General-purpose v1
- General-purpose v2
- Microsoft Fabric OneLake
Sharing:
- Sharing with a listing isn’t currently supported.
- Direct sharing isn’t currently supported.
Access by third-party clients to Iceberg data and metadata
- Third-party clients can’t append to, delete from, or upsert data to Iceberg tables that use Snowflake as the catalog.
Table optimization
- Snowflake doesn’t support orphan file deletion for Snowflake-managed Iceberg tables. If you see a mismatch between storage usage for your external cloud storage and Snowflake, you might have orphan files in your external cloud storage. To see your storage usage for Snowflake, you can use the INFORMATION_SCHEMA.TABLE_STORAGE_METRICS view or the ACCOUNT_USAGE.TABLE_STORAGE_METRICS view. If you see a mismatch, contact Snowflake Support for assistance with determining whether you have orphan files and removing them.
- For Snowflake-managed Iceberg tables, if a DML operation fails unexpectedly and rolls back, some Parquet files might get written to your external cloud storage but won’t be tracked or referenced by your Iceberg table metadata. These Parquet files are orphan files.
External query engines through Snowflake Horizon Catalog
This section lists the considerations for accessing, querying, and writing to Iceberg tables with an external query engine.
Consider the following items when you access Iceberg tables with an external query engine:
- Iceberg:
  - For tables in Snowflake:
    - Only Snowflake-managed Iceberg tables are supported.
- Listings:
  - Iceberg tables that you share through auto-fulfillment for listings aren’t accessible through the consumer account’s Horizon Iceberg REST Catalog API.
- Network and private connectivity:
  - Using network policies that are set at the user level isn’t supported with this feature.
  - For Snowflake-managed network rules, egress IP addresses that are static aren’t supported.
  - Explicitly granting the Horizon Catalog endpoint access to your storage accounts isn’t supported. We recommend that you use private connectivity for secure connectivity from external engines to Horizon Catalog and from Horizon Catalog to your storage account.
- Clouds:
  - Commercial: This feature is only supported for Snowflake-managed Iceberg tables that are stored on Amazon S3, Google Cloud, or Microsoft Azure for all commercial cloud regions. S3-compatible non-AWS storage isn’t yet supported.
  - FedRAMP (Moderate): This feature is supported for Snowflake-managed Iceberg tables that are stored on FedRAMP (Moderate) deployments on AWS Commercial Gov (US) in the us-east-1 and us-west-2 regions.
  - For Iceberg tables stored on Amazon S3:
    - If you want to use SSE-KMS encryption, contact customer support or your account team for assistance with enabling access.
      Note
      Writing to KMS-encrypted external volumes is not supported.
  - For Iceberg tables stored on Azure:
    - Azure Virtual Network (VNet) isn’t supported.
- Authentication:
  - For key-pair authentication, key-pair rotation isn’t supported.
  - Workload identity federation isn’t supported with this feature.
Consider the following items when you query (read) Iceberg tables with an external query engine:
- Iceberg:
  - Querying the following tables isn’t supported:
    - Remote tables
    - Snowflake native tables
    - Externally managed Iceberg tables including Delta-based Iceberg tables and Snowflake-managed Iceberg tables that you loaded with data from Iceberg-compatible Parquet data files by using the COPY INTO table command
  - Reading Iceberg v2 tables is supported.
  - Reading Iceberg V3 tables (public preview) is supported for the following capabilities:
    - Variant data type
    - Row lineage
    All other Iceberg V3 capabilities, including default values and the geography data type, aren’t supported.
- Access control:
  - Tables protected by the following fine-grained data policies can be accessed over Apache Spark™ through Snowflake Horizon Catalog:
    - Masking policies
    - Tag-based masking policies
    - Row access policies
    For more information, see Enforce data protection policies when querying Apache Iceberg™ tables from Apache Spark™.
- Cloned and converted tables:
  - Reading cloned or converted tables is not supported with vended credentials. To read these tables, use direct access to object storage.
Consider the following items when you write to Iceberg tables with an external query engine:
- Table operations:
  - You can’t specify a base location with your CREATE TABLE statement.
    When you create a Snowflake-managed table without specifying a base location, Snowflake constructs the following path for your table:
    STORAGE_BASE_URL/database/schema/table_name.randomId/[data | metadata]/
  - CREATE TABLE AS SELECT (CTAS) from an external engine is not supported.
  - Equality deletes aren’t supported.
  - You can’t write to tables by using row-level deletes; only copy-on-write mode is supported.
  - Creating Iceberg tags and branches isn’t supported.
  - External engine writes are supported only for Iceberg version 2 tables; writing to Iceberg version 3 (v3) tables (public preview) is not currently supported.
  - Writing to KMS-encrypted external volumes is not supported.
  - Writing to dynamic tables in Snowflake isn’t supported.
  - Writing to shared Iceberg tables isn’t supported.
  - Registering Iceberg tables isn’t supported.
- Maintenance operations:
  - You can’t roll back a table to a previous snapshot.
  - The snapshot expiration operation isn’t supported.
  - You can’t upgrade an Iceberg table from v2 to v3.
- Cloned and converted tables:
  - Writing to cloned or converted tables is not supported with vended credentials. To write to these tables, connect your external query engine directly to the object storage where your tables are stored.
  - You can’t write to an Iceberg table that was converted from externally managed to Snowflake managed.
- Streams:
  - On Iceberg V2 tables, copy-on-write operations cause standard streams to represent an updated or relocated row as a DELETE record followed by an INSERT record for the same row.
- Fine-grained access control policies:
  - Writing to tables that have fine-grained access control policies or tags isn’t supported.
Native App Framework
You can share Iceberg tables with consumers through the Snowflake Native App Framework. Be aware of the following restrictions:
- Iceberg tables shared through a Native App are read-only for consumers.
- Cross-Cloud Auto-Fulfillment is not supported for apps that share Iceberg tables.
- Consumers must explicitly enable the EXTERNAL_DATA restricted feature for the app before it can resolve Iceberg tables. For more information, see Request access to external and Apache Iceberg™ tables.
Unsupported features
The following Snowflake features aren’t currently supported for any Iceberg tables:
- Collation
- Fail-safe
- Hybrid tables
- Snowflake encryption
- Snowflake schema evolution
- Tagging using the ASSOCIATE_SEMANTIC_CATEGORY_TAGS stored procedure
- Temporary and transient tables
The following features aren’t supported for externally managed Iceberg tables:
- Cloning
- Clustering
- Standard and append-only streams. Insert-only streams are supported.
- Replication of Iceberg tables, external volumes, or catalog integrations