ListDropbox 2025.10.2.19
Bundle
org.apache.nifi | nifi-dropbox-processors-nar
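Description
Retrieves a listing of files from Dropbox (shortcuts are ignored). Each listed file may result in one FlowFile, with the file's metadata written as FlowFile attributes. If the 'Record Writer' property is set, the entire result is written as records to a single FlowFile. This processor runs only on the primary node in a cluster. If the primary node changes, the new primary node picks up where the previous node left off without duplicating all of the data.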
Input Requirement
FORBIDDEN
Supports Sensitive Dynamic Properties
false
Properties

| Property | Description |
|---|---|
| Dropbox Credential Service | Controller service used to obtain Dropbox credentials (App Key, App Secret, Access Token, Refresh Token). See the controller service's "Additional Details" for more information. |
| Folder | The Dropbox identifier or path of the folder from which to pull the list of files. 'Folder' should match the following regular expression pattern: `/.*\|id:.*`. Example of a folder identifier: id:odTlUvbpIEAAAAAAAAAGGQ. Example of a folder path: /Team1/Task1. |
| Minimum File Age | The minimum age a file must have in order to be considered; any file younger than this is ignored. |
| Search Recursively | Indicates whether to list files from subfolders of the Dropbox folder. |
| et-initial-listing-target | Specifies how the initial listing should be handled. Used by the 'Tracking Entities' strategy. |
| et-state-cache | Listed entities are stored in the specified cache storage so that this processor can resume listing across a NiFi restart or after a primary node change. The 'Tracking Entities' strategy requires tracking information for all entities listed within the last 'Tracking Time Window'. To support a large number of entities, the strategy uses DistributedMapCache instead of managed state. The cache key format is 'ListedEntities::{processorId}(::{nodeId})'. If listed entities are tracked per node, the optional '::{nodeId}' part is added to manage state separately. For example, cluster-wide cache key = 'ListedEntities::8dda2321-0164-1000-50fa-3042fe7d6a7b', per-node cache key = 'ListedEntities::8dda2321-0164-1000-50fa-3042fe7d6a7b::nifi-node3'. The stored cache content is a gzipped JSON string. The cache key is deleted when the target listing configuration is changed. Used by the 'Tracking Entities' strategy. |
| et-time-window | Specifies how long this processor should track already-listed entities. The 'Tracking Entities' strategy can pick any entity whose timestamp is inside the specified time window. For example, if set to '30 minutes', any entity with a timestamp in the most recent 30 minutes is a listing target when this processor runs. A listed entity is considered 'new/updated' and a FlowFile is emitted if any of the following conditions is met: 1. it does not exist in the already-listed entities, 2. it has a newer timestamp than the cached entity, 3. it has a different size than the cached entity. If a cached entity's timestamp becomes older than the specified time window, that entity is removed from the cached already-listed entities. Used by the 'Tracking Entities' strategy. |
| listing-strategy | Specifies how to determine new/updated entities. See the description of each strategy for details. |
| proxy-configuration-service | Specifies the Proxy Configuration Controller Service used to proxy network requests. |
| record-writer | Specifies the Record Writer to use for creating the listing. If not specified, one FlowFile is created for each entity that is listed. If a Record Writer is specified, all entities are written to a single FlowFile instead of adding attributes to individual FlowFiles. |
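
The pattern given for the Folder property can be checked outside NiFi before a value is configured. The following is a minimal Python sketch, not NiFi's own validator: the function name `is_valid_folder` is hypothetical, and it assumes the pattern must match the entire value rather than a substring.

```python
import re

# Pattern quoted in the Folder property description.
FOLDER_PATTERN = re.compile(r"/.*|id:.*")


def is_valid_folder(value: str) -> bool:
    """Return True if the whole value matches the documented Folder pattern.

    Assumption: the processor requires a full match, so fullmatch() is used
    here instead of search().
    """
    return FOLDER_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    # The two examples from the property description, plus two invalid values.
    for candidate in ["/Team1/Task1", "id:odTlUvbpIEAAAAAAAAAGGQ", "Team1/Task1", ""]:
        print(repr(candidate), is_valid_folder(candidate))
```

A path without a leading slash and an empty string are both rejected under this reading of the pattern.
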
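State Management

| Scope | Description |
|---|---|
| CLUSTER | The processor stores the data it needs to keep track of which files have already been listed. Exactly what is stored depends on the 'Listing Strategy'. |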
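
The 'Tracking Entities' rules given in the et-state-cache and et-time-window descriptions can be sketched in Python as follows. This is an illustration of the documented rules only, not NiFi's implementation: the `Entity` class, the `cache_key` and `track` functions, the in-memory dict standing in for the cache, the millisecond timestamps, and the inclusive window boundary are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Entity:
    """Hypothetical stand-in for one listed file."""
    identifier: str
    timestamp: int  # assumed to be epoch milliseconds
    size: int


def cache_key(processor_id: str, node_id: Optional[str] = None) -> str:
    """Build a key in the documented 'ListedEntities::{processorId}(::{nodeId})' format."""
    key = f"ListedEntities::{processor_id}"
    if node_id is not None:
        key += f"::{node_id}"  # optional per-node suffix
    return key


def track(listed: List[Entity], cache: Dict[str, Entity],
          now_ms: int, window_ms: int) -> List[Entity]:
    """Return the entities considered new/updated and update the cache in place."""
    window_start = now_ms - window_ms
    emitted = []
    for entity in listed:
        if entity.timestamp < window_start:
            continue  # outside the tracking time window, not a listing target
        cached = cache.get(entity.identifier)
        if (cached is None                            # 1. not listed before
                or entity.timestamp > cached.timestamp  # 2. newer timestamp
                or entity.size != cached.size):         # 3. different size
            emitted.append(entity)
            cache[entity.identifier] = entity
    # Drop cached entities whose timestamp has fallen out of the window.
    for identifier in [k for k, v in cache.items() if v.timestamp < window_start]:
        del cache[identifier]
    return emitted


if __name__ == "__main__":
    cache: Dict[str, Entity] = {}
    window = 30 * 60 * 1000  # '30 minutes' in milliseconds
    files = [Entity("id:a", 9_000_000, 10), Entity("id:b", 1_000_000, 5)]
    print(track(files, cache, now_ms=10_000_000, window_ms=window))
    print(cache_key("8dda2321-0164-1000-50fa-3042fe7d6a7b", "nifi-node3"))
```

Running `track` a second time with the same listing emits nothing, because every entity inside the window is already cached with the same timestamp and size.
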
Relationships

| Name | Description |
|---|---|
| success | All FlowFiles that are received are routed to 'success'. |
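Writes Attributes

| Name | Description |
|---|---|
| dropbox.id | The Dropbox identifier of the file |
| path | The path of the folder in which the file is located |
| filename | The name of the file |
| dropbox.size | The size of the file |
| dropbox.timestamp | The server-modified time of the file |
| dropbox.revision | The revision of the file |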