MergeRecord 2.3.0¶
BUNDLE¶
org.apache.nifi | nifi-standard-nar
DESCRIPTION¶
This Processor merges together multiple record-oriented FlowFiles into a single FlowFile that contains all of the Records of the input FlowFiles. This Processor works by creating ‘bins’ and then adding FlowFiles to these bins until they are full. Once a bin is full, all of the FlowFiles will be combined into a single output FlowFile, and that FlowFile will be routed to the ‘merged’ Relationship. A bin will consist of potentially many ‘like FlowFiles’. In order for two FlowFiles to be considered ‘like FlowFiles’, they must have the same Schema (as identified by the Record Reader) and, if the <Correlation Attribute Name> property is set, the same value for the specified attribute. See Processor Usage and Additional Details for more information. NOTE: this processor should NOT be configured with Cron Driven for the Scheduling Strategy.
INPUT REQUIREMENT¶
REQUIRED
Supports Sensitive Dynamic Properties¶
false
PROPERTIES¶
Property |
Description |
---|---|
Attribute Strategy |
Determines which FlowFile attributes should be added to the bundle. If ‘Keep All Unique Attributes’ is selected, any attribute on any FlowFile that gets bundled will be kept unless its value conflicts with the value from another FlowFile. If ‘Keep Only Common Attributes’ is selected, only the attributes that exist on all FlowFiles in the bundle, with the same value, will be preserved. |
correlation-attribute-name |
If specified, two FlowFiles will be binned together only if they have the same value for this Attribute. If not specified, FlowFiles are bundled by the order in which they are pulled from the queue. |
max-bin-age |
The maximum age of a Bin that will trigger a Bin to be complete. Expected format is <duration> <time unit> where <duration> is a positive integer and time unit is one of seconds, minutes, hours |
max-bin-size |
The maximum size for the bundle. If not specified, there is no maximum. This is a ‘soft limit’ in that if a FlowFile is added to a bin, all records in that FlowFile will be added, so this limit may be exceeded by up to the number of bytes in last input FlowFile. |
max-records |
The maximum number of Records to include in a bin. This is a ‘soft limit’ in that if a FlowFIle is added to a bin, all records in that FlowFile will be added, so this limit may be exceeded by up to the number of records in the last input FlowFile. |
max.bin.count |
Specifies the maximum number of bins that can be held in memory at any one time. This number should not be smaller than the maximum number of concurrent threads for this Processor, or the bins that are created will often consist only of a single incoming FlowFile. |
merge-strategy |
Specifies the algorithm used to merge records. The ‘Defragment’ algorithm combines fragments that are associated by attributes back into a single cohesive FlowFile. The ‘Bin-Packing Algorithm’ generates a FlowFile populated by arbitrarily chosen FlowFiles |
min-bin-size |
The minimum size of for the bin |
min-records |
The minimum number of records to include in a bin |
record-reader |
Specifies the Controller Service to use for reading incoming data |
record-writer |
Specifies the Controller Service to use for writing out the records |
RELATIONSHIPS¶
NAME |
DESCRIPTION |
---|---|
failure |
If the bundle cannot be created, all FlowFiles that would have been used to created the bundle will be transferred to failure |
merged |
The FlowFile containing the merged records |
original |
The FlowFiles that were used to create the bundle |
WRITES ATTRIBUTES¶
NAME |
DESCRIPTION |
---|---|
record.count |
The merged FlowFile will have a ‘record.count’ attribute indicating the number of records that were written to the FlowFile. |
mime.type |
The MIME Type indicated by the Record Writer |
merge.count |
The number of FlowFiles that were merged into this bundle |
merge.bin.age |
The age of the bin, in milliseconds, when it was merged and output. Effectively this is the greatest amount of time that any FlowFile in this bundle remained waiting in this processor before it was output |
merge.uuid |
UUID of the merged FlowFile that will be added to the original FlowFiles attributes |
merge.completion.reason |
This processor allows for several thresholds to be configured for merging FlowFiles. This attribute indicates which of the Thresholds resulted in the FlowFiles being merged. For an explanation of each of the possible values and their meanings, see the Processor’s Usage / documentation and see the ‘Additional Details’ page. |
<Attributes from Record Writer> |
Any Attribute that the configured Record Writer returns will be added to the FlowFile. |
USE CASES¶
Combine together many arbitrary Records in order to create a single, larger file |
Use Cases Involving Other Components¶
Combine together many Records that have the same value for a particular field in the data, in order to create a single, larger file |