ParseTableImage 2025.3.28.13-SNAPSHOT
BUNDLE
com.snowflake.openflow.runtime | runtime-document-layout-nar
DESCRIPTION
Extracts the text from an image of a table and writes it to the FlowFile content in CSV format.
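As a rough illustration of the output shape, the sketch below serializes a grid of cell strings to CSV with Python's standard `csv` module. The function name `cells_to_csv` and the sample rows are hypothetical; the processor's actual quoting and line-ending conventions are not documented here, so treat this only as an approximation of what "CSV format" might look like.

```python
import csv
import io


def cells_to_csv(rows):
    """Serialize a grid of cell strings (a list of rows) to CSV text."""
    buffer = io.StringIO()
    # lineterminator="\n" is an assumption; the processor may use "\r\n".
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical cell text recovered from a table image.
rows = [
    ["Item", "Qty", "Note"],
    ["Widget", "2", "blue, small"],
]
csv_text = cells_to_csv(rows)
print(csv_text)
```

Cells that contain a comma are quoted so that each row keeps the same number of columns when parsed downstream.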
INPUT REQUIREMENT
REQUIRED
Supports Sensitive Dynamic Properties
false
PROPERTIES
| Property | Description |
|---|---|
| Communication Timeout | The amount of time to wait for a response from the microservices before timing out. |
| Custom Table Structure Recognition Service URL | The Custom URL of the Openflow Table Structure Recognition Service. |
| MIME Type | The MIME Type of the image file. |
| OCR Confidence Threshold | The minimum confidence level required for a text block to be included in the output. Text blocks with a confidence level below this value will be excluded. |
| OCR Service | An OCR Service for reading files to output text. |
| Service Location Strategy | Determines how Service Locations are configured within this processor for the Table Structure Recognition Service. |
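The OCR Confidence Threshold rule can be sketched as a simple filter. The `TextBlock` type, the `filter_by_confidence` function, and the 0.0 to 1.0 confidence scale are assumptions made for illustration; the processor's internal data model and scale are not specified here.

```python
from dataclasses import dataclass


@dataclass
class TextBlock:
    """Hypothetical OCR result: recognized text plus a confidence score."""
    text: str
    confidence: float  # assumed scale: 0.0 (no confidence) to 1.0 (certain)


def filter_by_confidence(blocks, threshold):
    """Keep blocks at or above the threshold; drop those below it."""
    return [block for block in blocks if block.confidence >= threshold]


blocks = [
    TextBlock("Total", 0.98),
    TextBlock("T0ta1", 0.41),
    TextBlock("42.00", 0.75),
]
kept = filter_by_confidence(blocks, threshold=0.75)
print([block.text for block in kept])
```

A block whose confidence equals the threshold is kept, since only blocks below the value are described as excluded.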
RELATIONSHIPS
| NAME | DESCRIPTION |
|---|---|
| table.not.found | If the processor determines that an input FlowFile does not contain a table, the original FlowFile will be routed to this relationship. |
| failure | If a FlowFile cannot be converted into a CSV, the input FlowFile will be routed to this relationship. |
| success | When the table text has been successfully extracted, the CSV representation of the text will be routed to this relationship. |
| comms.failure | If the processor is unable to communicate with one of the necessary services, the input FlowFile will be routed to this relationship. |
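The four relationships can be read as a decision sequence. The sketch below is one plausible ordering, not the processor's actual implementation: the function `choose_relationship`, its boolean parameters, and the order in which the conditions are checked are all assumptions. Only the relationship names come from the table above.

```python
def choose_relationship(services_reachable, table_found, csv_converted):
    """Return the relationship name for a FlowFile, given three outcomes.

    The check order is an assumption: communication problems first,
    then table detection, then CSV conversion.
    """
    if not services_reachable:
        return "comms.failure"    # a required service could not be reached
    if not table_found:
        return "table.not.found"  # the image does not contain a table
    if not csv_converted:
        return "failure"          # the table could not be converted to CSV
    return "success"              # the CSV representation was produced


print(choose_relationship(True, True, True))
print(choose_relationship(True, False, False))
```

In a flow, each of these relationships would typically be connected to its own downstream path, for example retrying on `comms.failure` and logging or archiving on `failure`.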
WRITES ATTRIBUTES
| NAME | DESCRIPTION |
|---|---|
| filename | The filename of the FlowFile. |
| mime.type | The MIME type of the FlowFile. |
| table.text.json | If the processor successfully extracts the table text, or if it is determined that the FlowFile does not contain a table, this attribute will be removed. |