SnowflakeDetectDuplicate 2025.3.28.13-SNAPSHOT

BUNDLE

com.snowflake.openflow.runtime | runtime-snowflake-processors-nar

DESCRIPTION

Checks whether a FlowFile’s hash (provided as a FlowFile attribute) already exists in a Snowflake table, and routes the FlowFile to ‘duplicate’ if the hash is found, ‘distinct’ if it is not found, or ‘failure’ if an error occurs.

TAGS

database, detect, duplicates, hash, snowflake

INPUT REQUIREMENT

REQUIRED

Supports Sensitive Dynamic Properties

false

PROPERTIES

Property

Description

Content Hash

The name of the FlowFile attribute that holds the pre-computed hash. Supports Expression Language.

Document Source Identifier

Specifies the document source identifier (doc ID). Supports Expression Language.

Document Source Name

Specifies the document source system name. Supports Expression Language.

Snowflake Connection Service

The DBCPService that provides the connection to Snowflake.

Snowflake Table Name

The name of the Snowflake table that stores the file hashes. The database and schema must already be configured in the Snowflake Connection Service.
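
As a rough illustration of what the Content Hash property expects, the sketch below computes a digest of some content and stores it under an attribute name. The attribute name `content.hash`, the choice of SHA-256, and the `attach_hash` helper are all assumptions made for this example; the processor itself only requires that the named attribute hold a pre-computed hash.

```python
import hashlib


def attach_hash(content: bytes, attributes: dict,
                attribute_name: str = "content.hash") -> dict:
    """Return a copy of `attributes` with a hex SHA-256 digest of `content` added.

    `attribute_name` is hypothetical; it stands for whatever attribute name
    the Content Hash property is configured to read.
    """
    updated = dict(attributes)  # copy so the caller's map is not mutated
    updated[attribute_name] = hashlib.sha256(content).hexdigest()
    return updated


# Example: attach a hash to a FlowFile-like attribute map.
attrs = attach_hash(b"hello", {"filename": "a.txt"})
```

With this setup, Content Hash would be set to `content.hash` so the processor reads the digest from that attribute.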

RELATIONSHIPS

NAME

DESCRIPTION

distinct

FlowFiles whose hash does not match an existing document are routed here, and the new hash is inserted into the table.

failure

FlowFiles that encounter an error or exception during processing are routed here.

duplicate

FlowFiles that match an existing document (same hash) are routed here.

WRITES ATTRIBUTES

NAME

DESCRIPTION

snowflake.detect.duplicate

Set to ‘true’ or ‘false’ to indicate whether the FlowFile was detected as a duplicate.
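
The routing behaviour described above can be approximated with a check-then-insert lookup. The following is a minimal sketch, not the processor's actual implementation: it uses an in-memory SQLite database as a stand-in for Snowflake, and the `detect_duplicate` function, the `file_hashes` table, and the `content_hash` column are hypothetical names. How the Document Source Identifier and Document Source Name properties are stored is not specified here, so they are left out, and whether the real processor writes the attribute on ‘failure’ is also unknown.

```python
import sqlite3


def detect_duplicate(conn, table, attributes, hash_attribute):
    """Return (relationship, attributes) for one FlowFile-like attribute map.

    Assumes `table` is trusted configuration (it is interpolated into SQL)
    and that it has a `content_hash` column; both are assumptions.
    """
    try:
        content_hash = attributes[hash_attribute]  # KeyError if the attribute is missing
        row = conn.execute(
            f"SELECT 1 FROM {table} WHERE content_hash = ?", (content_hash,)
        ).fetchone()
        is_duplicate = row is not None
        if not is_duplicate:
            # Unseen hash: record it so later FlowFiles are detected as duplicates.
            conn.execute(
                f"INSERT INTO {table} (content_hash) VALUES (?)", (content_hash,)
            )
            conn.commit()
    except (KeyError, sqlite3.Error):
        # Any lookup or database error routes to 'failure' with attributes unchanged.
        return "failure", dict(attributes)

    updated = dict(attributes)
    updated["snowflake.detect.duplicate"] = "true" if is_duplicate else "false"
    return ("duplicate" if is_duplicate else "distinct"), updated


# Usage: the first occurrence is distinct, the second is a duplicate,
# and a FlowFile without the hash attribute goes to failure.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file_hashes (content_hash TEXT PRIMARY KEY)")
first = detect_duplicate(conn, "file_hashes", {"content.hash": "abc"}, "content.hash")
second = detect_duplicate(conn, "file_hashes", {"content.hash": "abc"}, "content.hash")
missing = detect_duplicate(conn, "file_hashes", {}, "content.hash")
```

Note that a separate check followed by an insert is not atomic; if several instances run concurrently, two FlowFiles with the same hash could both be treated as distinct unless the table or the statement enforces uniqueness.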
