Manage Apache Iceberg™ tables¶
Manage Apache Iceberg™ tables in Snowflake:
You can also convert an Iceberg table that uses an external catalog into a table that uses Snowflake as the Iceberg catalog. To learn more, see Convert an Apache Iceberg™ table to use Snowflake as the catalog.
Query a table¶
To query an Iceberg table, a user must be granted or inherit the following privileges:
The USAGE privilege on the database and schema that contain the table
The SELECT privilege on the table
You can query an Iceberg table using a SELECT statement. For example:
SELECT col1, col2 FROM my_iceberg_table;
Use DML commands with Snowflake-managed tables¶
Iceberg tables that use Snowflake as the catalog support full Data Manipulation Language (DML) commands, including the following:
Note
Snowflake-managed tables also support efficient bulk loading using features such as COPY INTO <table> and Snowpipe. For more information, see Load data into Apache Iceberg™ tables.
Example: Update a table¶
You can use INSERT and UPDATE statements to modify Snowflake-managed Iceberg tables.
The following example inserts a new value into an Iceberg table named store_sales
,
then updates the cola
column to 1 if the value is currently -99.
INSERT INTO store_sales VALUES (-99);
UPDATE store_sales
SET cola = 1
WHERE cola = -99;
Generate snapshots of DML changes¶
For tables that use Snowflake as the catalog, Snowflake automatically generates the Iceberg metadata. Snowflake writes
the metadata to a folder named metadata
on your external volume in the location specified by the BASE_LOCATION
parameter
when the table was created.
Alternatively, you can call the SYSTEM$GET_ICEBERG_TABLE_INFORMATION function to generate Iceberg metadata for any new changes.
For tables that aren’t managed by Snowflake, the function returns information about the latest refreshed snapshot.
For example:
SELECT SYSTEM$GET_ICEBERG_TABLE_INFORMATION('db1.schema1.it1');
Output:
+-----------------------------------------------------------------------------------------------------------+
| SYSTEM$GET_ICEBERG_TABLE_INFORMATION('DB1.SCHEMA1.IT1') |
|-----------------------------------------------------------------------------------------------------------|
| {"metadataLocation":"s3://mybucket/metadata/v1.metadata.json","status":"success"} |
+-----------------------------------------------------------------------------------------------------------+
Maintain tables that use an external catalog¶
You can perform maintenance operations on Iceberg tables that use an external catalog.
Maintenance operations include the following:
Expiring snapshots
Removing old metadata files
Compacting data files
Important
To keep your Iceberg table in sync with external changes, it’s important to align your Snowflake refresh schedule with table maintenance. Refresh the table each time you perform a maintenance operation.
To learn about maintenance for Iceberg tables that aren’t managed by Snowflake, see Maintenance (https://iceberg.apache.org/docs/latest/maintenance/) in the Apache Iceberg documentation.
Refresh the table metadata¶
When you use an external Iceberg catalog, you can refresh the table metadata using the ALTER ICEBERG TABLE … REFRESH command. Refreshing the table metadata synchronizes the metadata with the most recent table changes.
Refresh the metadata for a table¶
The following example manually refreshes the metadata for a table that uses an external catalog (for example, AWS Glue or Delta). Refreshing the table keeps the table in sync with any changes that have occurred in the remote catalog.
With this type of Iceberg table, you don’t specify a metadata file path in the command.
ALTER ICEBERG TABLE my_iceberg_table REFRESH;
Refresh the metadata for a table created from Iceberg files¶
The following example manually refreshes a table created from Iceberg metadata files in an external cloud storage location,
specifying the relative path to a metadata file without the leading forward slash (/
).
The metadata file defines the data in the table after refreshing.
ALTER ICEBERG TABLE my_iceberg_table REFRESH 'metadata/v1.metadata.json';
Retrieve storage metrics¶
Snowflake does not bill your account for Iceberg table storage costs. However, you can track how much storage an Iceberg table occupies by querying the TABLE_STORAGE_METRICS and TABLES views in the Snowflake Information Schema or Account Usage schema.
The following example query joins the ACCOUNT_USAGE.TABLE_STORAGE_METRICS view with the ACCOUNT_USAGE.TABLES view, filtering on the TABLES.IS_ICEBERG column.
SELECT metrics.* FROM
snowflake.account_usage.table_storage_metrics metrics
INNER JOIN snowflake.account_usage.tables tables
ON (
metrics.id = tables.table_id
AND metrics.table_schema_id = tables.table_schema_id
AND metrics.table_catalog_id = tables.table_catalog_id
)
WHERE tables.is_iceberg='YES';