Error handling in high-performance Snowpipe Streaming¶
This topic outlines the error handling mechanisms available in the high-performance edition of Snowpipe Streaming. This enhanced approach provides detailed error information and improves the overall error handling process for a more robust and informative experience.
Key error handling features in the high-performance architecture¶
Enhanced channel status endpoint: This edition extends the channel status endpoint to provide more comprehensive error information.
Granular error details: The high-performance edition provides more detailed error information to help identify where it occurred and find the root causes of ingestion issues.
Improved client experience: The high-performance edition simplifies error handling for clients, reducing the complexity of error reasoning and recovery.
The channel history view: SNOWPIPE_STREAMING_CHANNEL_HISTORY view provides a historical record of channel activity to monitor and locate errors. This feature enables you to track error trends and proactively address potential issues.
Channel status endpoint details¶
The high-performance architecture includes a channel status endpoint to provide more detailed, point-in-time information about a channel.
In addition to the channel status information for the classic architecture, which is statusCode
, persistedOffsetToken
, the high-performance architecture includes the following information:
channel_status_code
: Represents the current operational status of the streaming channel. This code provides a high-level indication of the channel’s health and ability to ingest data. For more information about the channel status codes, see Client-side error handling and required actions.last_commited_offset_token
: Indicates the offset token of the last row set that was successfully committed to the target table by Snowflake. This is crucial for tracking progress and ensuring data delivery.created_on_ms
: The timestamp, in milliseconds, that indicates when the streaming channel was initially created within Snowflake.database_name
: The name of the database to which the streaming channel is configured to ingest data.schema_name
: The name of the schema within the specified database where the target table for the streaming channel resides.pipe_name
: The name of the Snowpipe object that is configured to utilize this Snowpipe Streaming channel for data ingestion into a specific target table.channel_name
: A user-created name for the specific Snowpipe Streaming channel instance.rows_inserted
: A count of the total number of data rows that have been successfully inserted into the target table through this streaming channel since its creation.rows_parsed
: A count of the total number of data rows that have been processed and parsed by the Snowpipe Streaming service for this channel. (but not necessarily inserted, for example, due to errors).rows_error_count
: A count of the total number of data rows that encountered errors during processing and were therefore rejected by the Snowpipe Streaming service for this channel.last_error_offset_upper_bound
: The upper bound of the offset token range of the last rowset that contained errors. This helps in identifying the approximate location of the most recent errors within the data stream.last_error_message
: A human-readable message corresponding to the latest error code.last_error_timestamp
: The timestamp indicating when the most recent error occurred on this streaming channel.snowflake_avg_processing_latency_ms
: The average latency, in milliseconds, observed by the Snowflake service in processing rowsets received by this channel. This metric provides insight into the performance of the ingestion pipeline within Snowflake.
Error-handling flow in the high-performance architecture¶
Client sends data: The client application uses the Snowpipe Streaming SDK to send data to Snowflake via the
appendRow(s)
API.Server processing: The Snowflake service processes the data. This involves:
Buffering the data.
Parsing and validating the data.
Committing the data to the table.
Error detection: Errors can occur during any of the server-side processing stages.
Error recording: Snowflake records detailed information about the last occurred error, including the following information:
The upper bound of the offset token range of the last rowset that contained errors. This helps in identifying the approximate location of the most recent errors within the data stream.
An error message.
A timestamp.
Error reporting:
The enhanced channel status endpoint provides access to the recorded error information.
Clients can query this endpoint to retrieve last occurred error details.
SNOWPIPE_STREAMING_CHANNEL_HISTORY view provides a historical record of errors and their offsets.
Client action: The client application uses the error information to perform the following actions:
Identify the cause of the error.
Implement appropriate error handling logic, such as the following actions:
Retrying the failed operation.
Logging the error.
Alerting an administrator.
Moving the erroneous data to a dead-letter queue.
Reopening channels.
Client-side error handling and required actions¶
The Snowpipe Streaming SDK simplifies error handling by implementing internal retry logic for transient errors. However, for fatal channel errors and persistent authorization issues, you are required to take manual action.
SDK retry logic for transient errors¶
The SDK automatically retries the request to send unflushed data in the channel to the server for the following HTTP status codes, as they typically indicate a temporary or transient service issue:
5XX (Server errors)
429 (Too many requests)
408 (Request timeout)
Channel errors that require a manual reopen¶
The Snowpipe Streaming SDK doesn’t automatically reopen the channel. When a channel enters a state that isn’t valid, the customer must explicitly close and reopen the channel to continue ingestion.
A channel is considered not valid —– and requires client action — if the channel_status_code in the channel status response is anything other than SUCCESS
.
The following table shows persisted error codes that indicate a fatal channel state and require the channel to be reopened:
Error code |
Context |
Required client action |
---|---|---|
ERR_PIPE_DOES_NOT_EXIST_OR_NOT_AUTHORIZED |
The target pipe is missing or inaccessible. |
Fix the pipe issue. Reopen channel. |
ERR_TABLE_DOES_NOT_EXIST_NOT_AUTHORIZED |
The target table is missing or inaccessible. |
Fix the table issue. Reopen channel. |
ERR_CHANNEL_HAS_INVALID_ROW_SEQUENCER |
Row sequencing state isn’t valid. |
Reopen channel. |
ERR_CHANNEL_HAS_INVALID_CLIENT_SEQUENCER |
Channel sequencing state isn’t valid. |
Reopen channel. |
ERR_CHANNEL_MUST_BE_REOPENED |
A general error indicating the channel is unusable. |
Reopen channel. |
ERR_CHANNEL_MUST_BE_REOPENED_DUE_TO_ROW_SEQ_GAP |
A gap in the row sequence was detected. |
Reopen channel. |