<model_name>!DETECT_ANOMALIES

Detects and reports anomalies in the input data passed to the method. This is a method of the anomaly detector object that you create by executing the command.

The method returns a table that labels each row of the input data as anomalous or not.

If you need to select specific columns from the data returned by this method, you can call the method in the FROM clause of a SELECT statement. See Selecting columns from SQL class instance methods that return tabular data.

Syntax

<model_name>!DETECT_ANOMALIES(
  INPUT_DATA => <reference_to_data_to_analyze>,
  TIMESTAMP_COLNAME => '<timestamp_column_name>',
  TARGET_COLNAME => '<target_column_name>',
  [ CONFIG_OBJECT => <configuration_object>, ]
  [ SERIES_COLNAME => '<series_column_name>' ]
)

Note

model_name is the object that you create by executing the command.

Arguments

Required:

INPUT_DATA => reference_to_data_to_analyze

A reference to the table, view, or query that returns the data to analyze.

To create this reference, you can use the TABLE keyword with the table name, view name, or query, or you can call the SYSTEM$REFERENCE or SYSTEM$QUERY_REFERENCE function.

TIMESTAMP_COLNAME => 'timestamp_column_name'

The name of the column containing the timestamps (TIMESTAMP_NTZ) in the time-series data.

TARGET_COLNAME => 'target_column_name'

The name of the column containing the data to analyze (type NUMERIC or FLOAT).

Optional:

SERIES_COLNAME => 'series_column_name'

Name of the column containing the identifier for the series (for multi-series data). This column should be a VARIANT because it can be any type of value or values from multiple columns in an array.

CONFIG_OBJECT => config_object

An OBJECT containing key-value pairs used to configure the anomaly detection job.

KeyTypeDefaultDescription
prediction_intervalFLOAT0.99

Value between 0 and 1 that specifies the percentage of the observations that should be marked as anomalies:

  • For less strict anomaly detection (that is, identifying fewer observations marked as anomalies), specify a higher value.
  • For more strict anomaly detection (that is, identifying more observations as anomalies), reduce this value.
on_errorSTRING'ABORT'

String (constant) that specifies the error handling for the anomaly detection task. This is most useful when detecting anomalies in multiple series. Supported values are:

  • 'abort': Abort the operation if an error is encountered in any time series.
  • 'skip': Skip any time series where anomaly detection encounters an error. This allows anomaly detection to succeed for other time series. Series that failed are absent from the output.

Returns

The function returns the following columns:

Column NameData TypeDescription
SERIESVARIANTSeries value (NULL if model was trained with single time series).
TSTIMESTAMP_NTZThe timestamps of the data
YFLOATThe values for the time series
FORECASTFLOATThe predicted value at the timestamp.
LOWER_BOUNDFLOATThe lower bound of the value within the prediction interval. Values that are lower than this are flagged as anomalies.
UPPER_BOUNDFLOATThe upper bound of the value within the prediction interval. Values that are higher than this are flagged as anomalies.
IS_ANOMALYBOOLEANTrue if the value is an anomaly; False if not.
PERCENTILEFLOAT

The corresponding percentile of the observed Y value given the prediction interval.

If the percentile is outside of ((1 - alpha) / 2, 1 - (1 - alpha) / 2), the value is flagged as an anomaly. For example, if the prediction interval is 0.95, a percentile of 0.96 would not be an anomaly, but a percentile of 0.98 would be.

If the prediction_interval field is not specified in the configuration object, the default is 0.99.

DISTANCEFLOATThe multiple of the standard deviation from the FORECAST column (z-score)

Usage notes

  • The columns for the data specified in the command (in the INPUT_DATA constructor argument) must match the columns for the data specified in the INPUT_DATA argument of this method.

    For example, if you passed the SERIES_COLNAME argument to the command, you must also pass the SERIES_COLNAME argument to this method. If you omitted the SERIES_COLNAME argument in the command, you must omit that argument here.

  • If the column names specified by the TIMESTAMP_COLNAME or TARGET_COLNAME arguments do not exist in the table, view, or query specified by the INPUT_DATA argument, an error occurs.