How Snowflake represents trace events

Internally, Snowflake uses the OpenTelemetry (https://opentelemetry.io/) data model to represent trace events inside an object called a span. A span (https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/api.md#span) describes an operation, such as the invocation of a stored procedure or the execution of a UDF over a set of rows. A span includes the start time and end time of the operation.

Tip

For guidelines to keep in mind when adding trace events, see General guidelines for adding trace events.

How Snowflake emits trace events

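When a stored procedure or UDF is invoked, Snowflake can execute it in parallel, with each parallel execution unit running over a different set of rows. Any trace events emitted are scoped to their execution unit and wrapped in the same span.

For a Streamlit app, each user session is captured in a single span.

Trace events are emitted only after their execution unit completes. If an execution unit fails before it completes, trace events from that execution unit are not guaranteed to be emitted.

Trace events from different execution units are stored in separate rows of the event table (that is, in different spans).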

Note

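Because a UDF is applied to each input table row, calls to trace event APIs in the UDF are executed for every input row. In most cases, adding a trace event for every row is not recommended. Each execution unit is limited to 128 events.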
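One way to stay under the per-unit limit is to emit events only for rows worth noting and to stop once a budget is spent. The following is a minimal sketch of that idea, not a Snowflake API: `RowEventBudget` and its methods are hypothetical names, and the `Consumer<String>` sink is an assumed stand-in for a real call such as `Telemetry.addEvent(name)` in a Java handler.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper (not part of any Snowflake library): caps how many
// events a single execution unit emits while processing rows.
class RowEventBudget {
    private final int maxEvents;
    private final Consumer<String> sink; // assumed stand-in for Telemetry.addEvent(name)
    private int emitted = 0;
    private int suppressed = 0;

    RowEventBudget(int maxEvents, Consumer<String> sink) {
        this.maxEvents = maxEvents;
        this.sink = sink;
    }

    // Called once per input row; emits only for rows flagged as noteworthy,
    // and only while the budget has not been used up.
    void onRow(boolean noteworthy, String eventName) {
        if (!noteworthy) {
            return;
        }
        if (emitted < maxEvents) {
            sink.accept(eventName);
            emitted++;
        } else {
            suppressed++;
        }
    }

    int emitted() { return emitted; }

    int suppressed() { return suppressed; }

    public static void main(String[] args) {
        List<String> recorded = new ArrayList<>();
        RowEventBudget budget = new RowEventBudget(128, recorded::add);
        // Simulate 1000 rows where every second row is noteworthy.
        for (int row = 0; row < 1000; row++) {
            budget.onRow(row % 2 == 0, "evenRow");
        }
        System.out.println(budget.emitted() + " emitted, "
            + budget.suppressed() + " suppressed");
    }
}
```

Tracking the suppressed count lets the handler record it once, for example as a span attribute, so that later analysis can tell that events were withheld.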

Example: Emitting events from a Java procedure

The following example shows how to emit events from handler code. It also shows how the event table stores the resulting event data.

Stored procedure with a Java handler

The Java code in the following example illustrates how you can add events to a span, along with attribute data. For more information about APIs for handler languages, see Event tracing from handler code.

CREATE OR REPLACE PROCEDURE test_stored_proc()
RETURNS STRING
LANGUAGE JAVA
RUNTIME_VERSION = '11'
PACKAGES=('com.snowflake:snowpark:latest', 'com.snowflake:telemetry:latest')
HANDLER = 'MyClass.run'
AS
$$
  import com.snowflake.snowpark_java.Session;
  import com.snowflake.telemetry.Telemetry;
  import io.opentelemetry.api.common.AttributeKey;
  import io.opentelemetry.api.common.Attributes;
  import io.opentelemetry.api.common.AttributesBuilder;

  public class MyClass {

    public String run(Session session) {
      // Adding an event without attributes.
      Telemetry.addEvent("testEvent");

      // Adding an event with attributes.
      Attributes eventAttributes = Attributes.of(
          AttributeKey.stringKey("key"), "run",
          AttributeKey.longKey("result"), Long.valueOf(123));
      Telemetry.addEvent("testEventWithAttributes", eventAttributes);

      // Setting span attributes of different types.
      Telemetry.setSpanAttribute("example.boolean", true);
      Telemetry.setSpanAttribute("example.long", 2L);
      Telemetry.setSpanAttribute("example.double", 2.5);
      Telemetry.setSpanAttribute("example.string", "testAttribute");

      return "SUCCESS";
    }
  }
$$;

Recorded span data

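After a function or procedure executes successfully, Snowflake renders the OpenTelemetry span object as objects in event table columns, as shown in the following table.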

A span can have its own attributes. Because a span represents a stored procedure or UDF execution unit, you might find it useful to set span-level attributes for later data analysis. For more information about how to set span attributes, see the content specific to the language in which you’re writing handler code. For a list of these languages, see Event tracing from handler code.

A span can hold up to 128 trace events and up to 128 span attributes.

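  • If the number of trace events exceeds the limit, events are dropped as follows, depending on the handler language:

    • For Python handlers, events are dropped in the order in which they were added (in other words, first in, first out).
    • For handlers written in Java, JavaScript, Scala, and Snowflake Scripting, new events are dropped once the limit is reached.
  • If the number of span attributes exceeds the limit, no more span attributes can be added.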
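The two drop policies above can be compared with a small standalone simulation. This sketch only models the described behavior with plain collections. It does not call any Snowflake or OpenTelemetry API, and the class and method names (`SpanEventLimits`, `keepNewest`, `keepOldest`) are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of the two overflow policies described above.
class SpanEventLimits {
    static final int MAX_EVENTS = 128;

    // First in, first out (the behavior described for Python handlers):
    // when the buffer is full, the oldest event is dropped to make room.
    static List<String> keepNewest(List<String> events, int limit) {
        if (limit < 1) {
            throw new IllegalArgumentException("limit must be positive");
        }
        Deque<String> buffer = new ArrayDeque<>();
        for (String event : events) {
            if (buffer.size() == limit) {
                buffer.removeFirst();
            }
            buffer.addLast(event);
        }
        return new ArrayList<>(buffer);
    }

    // Drop new events (the behavior described for Java, JavaScript, Scala,
    // and Snowflake Scripting handlers): once full, later events are ignored.
    static List<String> keepOldest(List<String> events, int limit) {
        if (limit < 1) {
            throw new IllegalArgumentException("limit must be positive");
        }
        List<String> buffer = new ArrayList<>();
        for (String event : events) {
            if (buffer.size() < limit) {
                buffer.add(event);
            }
        }
        return buffer;
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        for (int i = 0; i < 130; i++) {
            events.add("e" + i);
        }
        List<String> fifo = keepNewest(events, MAX_EVENTS);
        List<String> dropNew = keepOldest(events, MAX_EVENTS);
        // With 130 events, FIFO retains e2..e129; drop-new retains e0..e127.
        System.out.println(fifo.get(0) + " .. " + fifo.get(fifo.size() - 1));
        System.out.println(dropNew.get(0) + " .. " + dropNew.get(dropNew.size() - 1));
    }
}
```

The practical consequence is that a Python handler that overflows loses its earliest events, while a Java handler that overflows loses its latest ones.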

Note

As of November 2022, all dropped_*_count keys are not set for JavaScript because the OpenTelemetry JavaScript Tracing SDK does not report on dropped counts.

Description: Span recorded by Snowflake for the execution of the procedure containing the handler code.
Data:
  • Start timestamp in the START_TIMESTAMP column:

2023-03-21 23:12:06.231

  • Completion timestamp in the TIMESTAMP column:

2023-03-21 23:12:06.944

  • Data in the RECORD column:
{
  "kind": "SPAN_KIND_INTERNAL",
  "name": "snow.auto_instrumented",
  "status": {
    "code": "STATUS_CODE_UNSET"
  }
}
Description: Attributes added by handler code for the span.
Data:
  • Data in the RECORD_ATTRIBUTES column:
{
  "example.boolean": true,
  "example.double": 2.5,
  "example.long": 2,
  "example.string": "testAttribute"
}
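Because the span row carries both a start and a completion timestamp, the duration of the execution unit can be derived from the two values shown above. The following is a minimal sketch using only `java.time`. The class name `SpanDuration` is hypothetical, and the format pattern is an assumption based on how the timestamps are printed here.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: computes elapsed milliseconds between two timestamps
// printed in the style shown in the table above.
class SpanDuration {
    // Assumed display format: date, space, time with millisecond precision.
    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");

    static long millisBetween(String startTimestamp, String endTimestamp) {
        LocalDateTime start = LocalDateTime.parse(startTimestamp, FORMAT);
        LocalDateTime end = LocalDateTime.parse(endTimestamp, FORMAT);
        return Duration.between(start, end).toMillis();
    }

    public static void main(String[] args) {
        long millis = millisBetween("2023-03-21 23:12:06.231",
                                    "2023-03-21 23:12:06.944");
        System.out.println("Span duration: " + millis + " ms"); // 713 ms
    }
}
```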

Recorded event data

The span contains a list of trace events with timestamps that capture when the trace events were added. Not shown here: The span has a trace_id which is the query ID without dashes. The span also has system-generated values for the span_id and name keys. Events that are part of the span share the same span_id.
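Since the trace_id is described as the query ID without dashes, a query ID can be turned into the value to look for with a single string operation. The following is a minimal sketch. The class name `TraceIds` is hypothetical, the sample query ID is made up for illustration, and any further normalization of the stored value (such as letter case) is not covered by this sketch.

```java
// Hypothetical helper: derives a trace_id from a query ID by removing dashes,
// following the description above.
class TraceIds {
    static String fromQueryId(String queryId) {
        return queryId.replace("-", "");
    }

    public static void main(String[] args) {
        // Made-up query ID, used only to show the transformation.
        String queryId = "01b2c3d4-0000-1111-0000-222233334444";
        System.out.println(fromQueryId(queryId));
        // prints 01b2c3d4000011110000222233334444
    }
}
```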

The following data was recorded for the event testEvent.

Description: Event name
Data:
  • Timestamp in the TIMESTAMP column:

2023-03-21 23:12:06.939

  • Data in the RECORD column:
{
  "dropped_attributes_count": 0,
  "name": "testEvent"
}

The following data was recorded for the event testEventWithAttributes.

Description: Event name
Data:
  • Timestamp in the TIMESTAMP column:

2023-03-21 23:12:06.940

  • Data in the RECORD column:
{
  "dropped_attributes_count": 0,
  "name": "testEventWithAttributes"
}
Description: Event attributes
Data:
  • Data in the RECORD_ATTRIBUTES column:
{
  "key": "run",
  "result": 123
}