高性能 Snowpipe Streaming 中的错误处理

本主题概述了高性能版本的 Snowpipe Streaming 中可用的错误处理机制。这种增强的方法提供了详细的错误信息,并改进了整体错误处理流程,从而提供了更强大、内容更丰富的体验。

高性能架构中的主要错误处理功能

  • 增强了通道状态端点:此版本扩展了通道状态端点,以提供更全面的错误信息。

  • 细化错误详情:高性能版本提供了更详细的错误信息,有助于确定错误发生的位置,并找出引入问题的根本原因。

  • 改善了客户端体验:高性能版本简化了客户端的错误处理,降低了错误推理和恢复的复杂性。

  • The channel history view: SNOWPIPE_STREAMING_CHANNEL_HISTORY 视图 provides a historical record of channel activity to monitor and locate errors. This feature enables you to track error trends and proactively address potential issues.

通道状态端点详情

The high-performance architecture includes a channel status endpoint to provide more detailed, point-in-time information about a channel.

In addition to the channel status information for the classic architecture, which is statusCode, persistedOffsetToken, the high-performance architecture includes the following information:

  • channel_status_code: Represents the current operational status of the streaming channel. This code provides a high-level indication of the channel's health and ability to ingest data. For more information about the channel status codes, see Client-side error handling and required actions.

  • last_commited_offset_token:表示 Snowflake 成功提交到目标表的最后行集的偏移量标记。这对于跟踪进度和确保数据交付至关重要。

  • created_on_ms: The timestamp, in milliseconds, that indicates when the streaming channel was initially created within Snowflake.

  • database_name:流式通道被配置为引入数据的数据库的名称。

  • schema_name:流式通道目标表所在的指定数据库中架构的名称。

  • pipe_name:Snowpipe 对象的名称,该对象被配置为利用此 Snowpipe Streaming 通道将数据引入到特定目标表中。

  • channel_name:用户为特定 Snowpipe Streaming 通道实例创建的名称。

  • rows_inserted:自该流式通道创建以来,通过该流式通道成功插入到目标表中的数据行总数统计。

  • rows_parsed:Snowpipe Streaming 服务为此通道处理和解析的数据行总数统计。(但不一定要插入,例如,由于错误)。

  • rows_error_count:在处理过程中遇到错误并因此被 Snowpipe Streaming 服务拒绝的该通道的数据行总数统计。

  • last_error_offset_upper_bound:最后一个包含错误的行集的偏移标记范围的上限。这有助于确定数据流中最近错误的大致位置。

  • last_error_message:与最新错误代码对应的可读消息。

  • last_error_timestamp:表示该流式通道中最近一次出现错误的时间戳。

  • snowflake_avg_processing_latency_ms:Snowflake 服务在处理该通道接收到的行集时观察到的平均延迟(以毫秒为单位)。通过该指标可深入了解 Snowflake 中引入管道的性能。

高性能架构中的错误处理流程

  • 客户端发送数据:客户端应用程序使用 Snowpipe Streaming SDK 通过 appendRow(s) API 向 Snowflake 发送数据。

  • 服务器处理:Snowflake 服务处理数据。这涉及:

    • 缓冲数据。

    • 解析和验证数据。

    • 将数据提交到表中。

  • 错误检测:在服务端处理的任何阶段都可能发生错误。

  • Error recording: Snowflake records detailed information about the last occurred error, including the following information:

    • 最后一个包含错误的行集的偏移标记范围的上限。这有助于确定数据流中最近错误的大致位置。

    • 错误消息。

    • 时间戳。

  • 错误报告:

    • 增强的通道状态端点提供了对记录的错误信息的访问权限。

    • 客户端可以查询此端点以检索上次发生的错误的详细信息。

    • SNOWPIPE_STREAMING_CHANNEL_HISTORY 视图 provides a historical record of errors and their offsets.

  • Client action: The client application uses the error information to perform the following actions:

    • 确定错误的原因。

    • Implement appropriate error handling logic, such as the following actions:

      • 重试失败的操作。

      • 记录错误。

      • 提醒管理员。

      • 将错误的数据移至死信队列。

      • 重新开放通道。

Client-side error handling and required actions

The Snowpipe Streaming SDK simplifies error handling by implementing internal retry logic for transient errors. However, for fatal channel errors and persistent authorization issues, you are required to take manual action.

SDK retry logic for transient errors

The SDK automatically retries the request to send unflushed data in the channel to the server for the following HTTP status codes, as they typically indicate a temporary or transient service issue:

  • 5XX (Server errors)

  • 429 (Too many requests)

  • 408 (Request timeout)

Channel errors that require a manual reopen

The Snowpipe Streaming SDK doesn't automatically reopen the channel. When a channel enters a state that isn't valid, the customer must explicitly close and reopen the channel to continue ingestion.

A channel is considered not valid —-- and requires client action --- if the channel_status_code in the channel status response is anything other than SUCCESS.

The following table shows persisted error codes that indicate a fatal channel state and require the channel to be reopened:

Error code

Context

Required client action

ERR_PIPE_DOES_NOT_EXIST_OR_NOT_AUTHORIZED

The target pipe is missing or inaccessible.

Fix the pipe issue. Reopen channel.

ERR_TABLE_DOES_NOT_EXIST_NOT_AUTHORIZED

The target table is missing or inaccessible.

Fix the table issue. Reopen channel.

ERR_CHANNEL_HAS_INVALID_ROW_SEQUENCER

Row sequencing state isn't valid.

Reopen channel.

ERR_CHANNEL_HAS_INVALID_CLIENT_SEQUENCER

Channel sequencing state isn't valid.

Reopen channel.

ERR_CHANNEL_MUST_BE_REOPENED

A general error indicating the channel is unusable.

Reopen channel.

ERR_CHANNEL_MUST_BE_REOPENED_DUE_TO_ROW_SEQ_GAP

A gap in the row sequence was detected.

Reopen channel.

Authorization errors that require configuration fix

When an ingestion attempt results in an HTTP authorization error, the customer must correct the underlying permission or credential issue. Don't reopen the channel for these errors because the new channel will immediately encounter the same problem.

  • 401 (Unauthorized)

  • 403 (Forbidden)

For these errors, the ingestion should be stopped, and the client application's security configuration --- for example, pipe permissions, user role, authentication credentials --- must be fixed before ingestion can resume. After fixing the authorization issue, you can reopen the client to continue ingestion.

语言: 中文