The PIPE object

The PIPE object is the server-side processing layer for Snowpipe Streaming. All streamed data flows through a pipe, which handles schema validation, optional in-flight data transformations, and optional pre-clustering before committing data to the target table.

The PIPE object provides the following capabilities:

  • In-flight transformations: Filter rows, reorder columns, cast types, and apply expressions during ingestion by using COPY command transformation syntax. This enables data cleansing and reshaping at ingest time, with no separate ETL step required.

  • Pre-clustering: Sort data during ingestion based on table clustering keys for optimized query performance.

  • Server-side schema validation: Validate incoming data against the schema defined in the pipe before committing.

  • Table feature support: Ingest into tables with defined clustering keys, DEFAULT value columns, and AUTOINCREMENT (or IDENTITY) columns.


For quick setup, Snowflake automatically creates a default pipe for every table. The default pipe handles ingestion with no manual DDL required. For advanced use cases that require transformations or pre-clustering, you can create a custom named pipe. For more information, see CREATE PIPE.
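As a minimal sketch of a custom named pipe (the database, schema, table, and pipe names are illustrative, and the statement assumes the streaming DATA_SOURCE syntax documented under CREATE PIPE):

```sql
-- Illustrative names; a named pipe pairs CREATE PIPE with a COPY INTO
-- statement that reads from the streaming data source.
CREATE OR REPLACE PIPE my_db.my_schema.my_streaming_pipe
  AS COPY INTO my_db.my_schema.events
  FROM TABLE (DATA_SOURCE(TYPE => 'STREAMING'))
  MATCH_BY_COLUMN_NAME = CASE_SENSITIVE;
```

With MATCH_BY_COLUMN_NAME and no transformation clauses, this named pipe behaves like the default pipe; its value comes from adding transformations or pre-clustering options, as shown in the sections that follow.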

Default pipe

Snowflake provides a default pipe for every target table. The default pipe is created on demand after the first successful pipe-info or open-channel call is made against the target table. This lets you start streaming data immediately, without manually executing CREATE PIPE DDL statements.

The default pipe has the following limitations:

  • No transformations: The default pipe uses MATCH_BY_COLUMN_NAME in the underlying copy statement. It doesn’t support custom data transformations.

  • No pre-clustering: The default pipe doesn’t support pre-clustering for the target table.

If your workflow requires transformations or pre-clustering, create your own named pipe. For more information, see CREATE PIPE.
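For example, a named pipe can cast types and apply expressions in flight by using COPY transformation syntax in its SELECT list. The sketch below uses hypothetical table and column names and assumes the streaming DATA_SOURCE syntax from the CREATE PIPE reference:

```sql
-- Hypothetical columns; each incoming row is reshaped before commit:
-- order_id and amount are cast, and region is normalized to uppercase.
CREATE OR REPLACE PIPE my_db.my_schema.transform_pipe
  AS COPY INTO my_db.my_schema.orders (order_id, amount, region)
  FROM (
    SELECT $1:order_id::NUMBER,
           $1:amount::NUMBER(10,2),
           UPPER($1:region::VARCHAR)
    FROM TABLE (DATA_SOURCE(TYPE => 'STREAMING'))
  );
```

Because the reshaping happens inside the pipe, clients keep sending raw rows and no separate ETL step is required.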

When you configure the Snowpipe Streaming SDK or REST API, you can reference the default pipe name in your client configuration to begin streaming. For more information, see Tutorial: Get started with Snowpipe Streaming high-performance architecture SDK and Tutorial: Get started with Snowpipe Streaming REST API using cURL and a JWT.

Pre-clustering data during ingestion

Snowpipe Streaming can cluster in-flight data during ingestion, which improves query performance on your target tables. Data is sorted according to the table’s clustering keys before it is committed.

To use pre-clustering, your target table must have clustering keys defined. You can then enable this feature by setting the parameter CLUSTER_AT_INGEST_TIME to TRUE in your COPY INTO statement when creating or replacing your Snowpipe Streaming pipe.

For more information, see CLUSTER_AT_INGEST_TIME.
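A sketch of a pre-clustering pipe, assuming illustrative names and the streaming DATA_SOURCE syntax from the CREATE PIPE reference:

```sql
-- Pre-clustering requires a clustering key on the target table.
ALTER TABLE my_db.my_schema.events CLUSTER BY (event_date, region);

-- CLUSTER_AT_INGEST_TIME = TRUE sorts rows by the clustering key
-- before they are committed to the table.
CREATE OR REPLACE PIPE my_db.my_schema.clustered_pipe
  AS COPY INTO my_db.my_schema.events
  FROM TABLE (DATA_SOURCE(TYPE => 'STREAMING'))
  MATCH_BY_COLUMN_NAME = CASE_SENSITIVE
  CLUSTER_AT_INGEST_TIME = TRUE;
```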

Important

When you use the pre-clustering feature, don’t disable the auto-clustering feature on the destination table. Disabling auto-clustering can lead to degraded query performance over time.