Emitting trace events in Python

You can use the Snowflake telemetry package to emit trace events from a function or procedure handler written in Python. The package is available from the Anaconda Snowflake channel (https://repo.anaconda.com/pkgs/snowflake).

You can access stored trace event data by executing a SELECT command on the event table. For more information, see Viewing trace data.

Note

For guidelines to keep in mind when adding trace events, see General guidelines for adding trace events.

For general information about setting up logging and retrieving messages in Snowflake, see Logging messages from functions and procedures.

Before logging from code, you must:

Adding support for the telemetry package

To use the telemetry package, you must make the open source Snowflake telemetry package (https://github.com/snowflakedb/snowflake-telemetry-python), which is included with Snowflake, available to your handler code. The package is available from the Anaconda Snowflake channel (https://repo.anaconda.com/pkgs/snowflake).

By default, the telemetry package is included when you create a Python handler for a stored procedure or function. However, if you specify a package policy to allow or disallow specific packages explicitly, Snowflake doesn’t automatically include the snowflake-telemetry-python package. In this case, you must specify the package in the PACKAGES clause.

  • For a Streamlit app. You can add the snowflake-telemetry-python package to your app by using Snowsight or an environment.yml file.

    Code in the following example uses the PACKAGES clause to reference the telemetry package. For a stored procedure written in Python, also list the Snowpark library (snowflake-snowpark-python) in the PACKAGES clause, because it is required for stored procedures written in Python. For more information, see Writing stored procedures with SQL and Python.

    CREATE OR REPLACE FUNCTION my_function(...)
      RETURNS ...
      LANGUAGE PYTHON
      ...
      PACKAGES = ('snowflake-telemetry-python')
      ...
  • Import the telemetry package in your code.

    from snowflake import telemetry

Adding trace events

You can add trace events by calling the telemetry.add_event method, passing a name for the event. You can also optionally associate attributes – key-value pairs – with an event.

The add_event method is available in the following form:

telemetry.add_event(<name>, <attributes>)

where:

Handler code in the following example adds two events, FunctionEmptyEvent and FunctionEventWithAttributes. With FunctionEventWithAttributes, the code also adds two attributes: key1 and key2.

telemetry.add_event("FunctionEmptyEvent")
telemetry.add_event("FunctionEventWithAttributes", {"key1": "value1", "key2": "value2"})

Adding these events results in two rows in the event table, each with a different value in the RECORD column:

{
  "name": "FunctionEmptyEvent"
}
{
  "name": "FunctionEventWithAttributes"
}

The FunctionEventWithAttributes event row includes the following attributes in the row’s RECORD_ATTRIBUTES column:

{
  "key1": "value1",
  "key2": "value2"
}
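To make the mapping from calls to rows concrete, the following sketch records each call as one dictionary that mirrors the RECORD and RECORD_ATTRIBUTES values shown above. The FakeTelemetry class is a hypothetical local stand-in, not part of the Snowflake telemetry package, and storing an empty dictionary for an event without attributes is an assumption made for illustration. The sketch runs outside Snowflake and writes nothing to an event table.

```python
# Hypothetical stand-in for the telemetry module, for local illustration only.
# It is NOT the Snowflake implementation; it only records what was passed in.
class FakeTelemetry:
    def __init__(self):
        self.rows = []  # one dict per emitted event, like one event-table row

    def add_event(self, name, attributes=None):
        # RECORD carries the event name; RECORD_ATTRIBUTES carries the attributes.
        self.rows.append({
            "RECORD": {"name": name},
            "RECORD_ATTRIBUTES": dict(attributes or {}),  # assumption: empty dict when omitted
        })


telemetry = FakeTelemetry()
telemetry.add_event("FunctionEmptyEvent")
telemetry.add_event("FunctionEventWithAttributes", {"key1": "value1", "key2": "value2"})

# Each call produced exactly one row.
for row in telemetry.rows:
    print(row["RECORD"], row["RECORD_ATTRIBUTES"])
```

Because the stand-in has the same call shape as telemetry.add_event(<name>, <attributes>), it could also serve as a test double when you exercise handler logic locally.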

Adding span attributes

You can set attributes – key-value pairs – associated with spans by calling the telemetry.set_span_attribute method.

For details on spans, see How Snowflake represents trace events.

The set_span_attribute method is available in the following form:

telemetry.set_span_attribute(<key>, <value>)

where:

Code in the following example creates four attributes and sets their values:

# Setting span attributes.
telemetry.set_span_attribute("example.boolean", True)
telemetry.set_span_attribute("example.long", 2)
telemetry.set_span_attribute("example.double", 2.5)
telemetry.set_span_attribute("example.string", "testAttribute")

Setting these attributes results in the following in the event table's RECORD_ATTRIBUTES column:

{
  "example.boolean": true,
  "example.long": 2,
  "example.double": 2.5,
  "example.string": "testAttribute"
}
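Handler values are not always one of the four types used above (boolean, integer, float, string). The following sketch shows one possible way to normalize a value before passing it as a span attribute. The helper name to_attribute_value is hypothetical, and converting every other type with str() is an assumption about what you might want, not a requirement stated by the telemetry package.

```python
from decimal import Decimal

# The four Python types that correspond to the attribute values in the example above.
_PRIMITIVES = (bool, int, float, str)


def to_attribute_value(value):
    """Return value unchanged if it is a bool, int, float, or str; otherwise str(value)."""
    if isinstance(value, _PRIMITIVES):
        return value
    return str(value)  # assumption: a string form is acceptable for other types


attributes = {
    "example.boolean": to_attribute_value(True),
    "example.long": to_attribute_value(2),
    "example.double": to_attribute_value(2.5),
    "example.decimal": to_attribute_value(Decimal("2.50")),  # converted to the string "2.50"
}
print(attributes)
```

Inside a handler, each resulting key and value could then be passed to telemetry.set_span_attribute(<key>, <value>).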

Adding custom spans

You can add custom spans that are separate from the default span created by Snowflake. For details on custom spans, see Adding custom spans to a trace.

Code in the following example uses the OpenTelemetry Python API (https://opentelemetry-python.readthedocs.io/en/latest/api/index.html) to create the my.span span and make it the current span with start_as_current_span. It then adds an event with attributes to the new span by calling the span's add_event method from the same API.

Event data won’t be captured by the event table unless the span ends before your handler completes execution. In this example, closing the span happens automatically when the with statement concludes.

CREATE OR REPLACE FUNCTION customSpansPythonExample() RETURNS STRING
  LANGUAGE PYTHON
  RUNTIME_VERSION = 3.12
  PACKAGES = ('opentelemetry-api')
  HANDLER = 'custom_spans_function'
  AS $$
  from snowflake import telemetry
  from opentelemetry import trace

  def custom_spans_function():
    tracer = trace.get_tracer("my.tracer")
    with tracer.start_as_current_span("my.span") as span:
      span.add_event("Event2 in custom span", {"key1": "value1", "key2": "value2"})

    return "success"
  $$;
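The reason the with statement is enough to end the span is that a context manager runs its exit logic whenever the block is left, whether normally or through an exception. The following sketch illustrates that behavior with a hypothetical stand-in named start_as_current_span. It is not the OpenTelemetry implementation; it only records the order in which stand-in spans end.

```python
from contextlib import contextmanager

ended_spans = []  # names of stand-in spans that have ended, in order


@contextmanager
def start_as_current_span(name):
    # Hypothetical stand-in: yields the span name, then records that the span ended.
    try:
        yield name
    finally:
        ended_spans.append(name)  # runs on normal exit and when an exception propagates


def handler():
    with start_as_current_span("my.span"):
        pass  # events would be added to the span here
    # The stand-in span has already ended here, before the handler returns.
    return list(ended_spans)


result = handler()
print(result)
```

If you manage a span without a with statement, you take on the responsibility of ending it yourself before the handler returns.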

Python examples

The following sections provide examples of adding support for trace events from Python code.

Stored procedure example

CREATE OR REPLACE PROCEDURE do_tracing()
  RETURNS VARIANT
  LANGUAGE PYTHON
  PACKAGES = ('snowflake-snowpark-python', 'snowflake-telemetry-python')
  RUNTIME_VERSION = 3.12
  HANDLER = 'run'
  AS $$
  from snowflake import telemetry
  def run(session):
    telemetry.set_span_attribute("example.proc.do_tracing", "begin")
    telemetry.add_event("event_with_attributes", {"example.key1": "value1", "example.key2": "value2"})
    return "SUCCESS"
  $$;

Streamlit example

import streamlit as st
from snowflake import telemetry

st.title("Streamlit trace event example")

hifives_val = st.slider("Number of high-fives", min_value=0, max_value=90, value=60)

if st.button("Submit"):
    telemetry.add_event("new_submission", {"high_fives": hifives_val})

UDF example

CREATE OR REPLACE FUNCTION times_two(x NUMBER)
  RETURNS NUMBER
  LANGUAGE PYTHON
  RUNTIME_VERSION = 3.12
  HANDLER = 'times_two'
AS $$
from snowflake import telemetry
def times_two(x):
  telemetry.set_span_attribute("example.func.times_two", "begin")
  telemetry.add_event("event_without_attributes")
  telemetry.add_event("event_with_attributes", {"example.key1": "value1", "example.key2": "value2"})

  response = 2 * x

  telemetry.set_span_attribute("example.func.times_two.response", response)

  return response
$$;

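When you call the trace event API from a Python function that processes an input row, the API is called for every row that the UDF processes.

For example, the following statement calls the Python function defined in the previous example for 50 rows, resulting in 100 trace events (two for each row):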

select count(times_two(seq8())) from table(generator(rowcount => 50));
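The arithmetic can be checked locally with a small simulation. In the sketch below, add_event is a hypothetical stand-in that only counts calls by event name (it is not the Snowflake telemetry function), and a list comprehension over range(50) stands in for the 50 generated rows.

```python
from collections import Counter

event_counts = Counter()  # hypothetical stand-in for rows landing in the event table


def add_event(name, attributes=None):
    # Stand-in for telemetry.add_event: counts events by name and stores nothing else.
    event_counts[name] += 1


def times_two(x):
    # Same two add_event calls as the UDF handler in the previous example.
    add_event("event_without_attributes")
    add_event("event_with_attributes", {"example.key1": "value1", "example.key2": "value2"})
    return 2 * x


results = [times_two(x) for x in range(50)]  # one handler call per input row
total_events = sum(event_counts.values())
print(total_events)  # 100
```

Because the event count grows linearly with the row count, it is worth keeping per-row events to the minimum you need on large inputs.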

UDTF example

CREATE OR REPLACE FUNCTION digits_of_number(input NUMBER)
  RETURNS TABLE(result NUMBER)
  LANGUAGE PYTHON
  RUNTIME_VERSION = 3.12
  HANDLER = 'TableFunctionHandler'
  AS
$$
from snowflake import telemetry

class TableFunctionHandler:

  def __init__(self):
    telemetry.add_event("test_udtf_init")

  def process(self, input):
    telemetry.add_event("test_udtf_process", {"input": str(input)})
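    # Emit the digits of input from least significant to most significant.
    while input > 0:
      response = input % 10
      # Floor division keeps input an integer, so the loop ends after the last digit.
      input //= 10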
      yield (response,)

  def end_partition(self):
    telemetry.add_event("test_udtf_end_partition")
$$;

When you call the trace event API in the process() method of a UDTF handler class, the API will be called for every row processed.

For example, the following statement calls the process() method defined in the previous example for 50 rows, resulting in 50 trace events (one for each row) added by the process() method:

select * from table(generator(rowcount => 50)), table(digits_of_number(seq8())) order by 1;
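To see which handler method contributes which events, the following sketch runs a copy of the handler class locally. The add_event function is a hypothetical stand-in that counts calls by event name, and the sketch assumes all 50 rows are processed by a single handler instance in one partition, which Snowflake does not guarantee. The loop uses floor division (//) to drop the last digit at each step.

```python
from collections import Counter

event_counts = Counter()  # hypothetical stand-in for rows landing in the event table


def add_event(name, attributes=None):
    # Stand-in for telemetry.add_event: counts events by name and stores nothing else.
    event_counts[name] += 1


class TableFunctionHandler:
    def __init__(self):
        add_event("test_udtf_init")

    def process(self, input):
        add_event("test_udtf_process", {"input": str(input)})
        while input > 0:
            yield (input % 10,)  # emit the last digit as a one-column row
            input //= 10         # drop the last digit

    def end_partition(self):
        add_event("test_udtf_end_partition")


# Assumption: one handler instance handles every row of a single partition.
handler = TableFunctionHandler()
digits = [row for n in range(50) for row in handler.process(n)]
handler.end_partition()
print(dict(event_counts))
```

In this local run, the process() stand-in records one event per input row, while the constructor and end_partition() each record one event for the partition.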