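About Openflow
Snowflake Openflow is an integration service that connects any data source and any destination with hundreds of processors supporting structured and unstructured text, images, audio, video, and sensor data. Built on Apache NiFi (https://nifi.apache.org/), Openflow lets you run a fully managed service in your own cloud for complete control.
Note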
The Openflow platform is currently available for deployment in customers' own VPCs in both AWS and Snowpark Container Services.
This topic describes the key features of Openflow, its benefits, architecture, workflow, and use cases.
Key features and benefits
- Open and extensible
An extensible managed service that's powered by Apache NiFi, enabling you to build and extend processors from any data source to any destination.
- Unified data integration platform
Openflow enables data engineers to handle complex, bi-directional data extraction and loading through a fully managed service that can be deployed inside your own VPC or within your Snowflake deployment.
- Enterprise-ready
Openflow offers out-of-the-box security, compliance, observability, and maintainability hooks for data integration.
- High-speed ingestion of all types of data
One unified platform lets you handle structured and unstructured data, in both batch and streaming modes, from your data source to Snowflake at virtually any scale.
- Continuous ingestion of multimodal data for AI processing
Near real-time unstructured data ingestion, so you can immediately chat with your data coming from sources such as SharePoint, Google Drive, and so on.
Openflow deployment models
Openflow is supported in both Bring Your Own Cloud (BYOC) and Snowpark Container Services (SPCS) versions.
Openflow - Snowflake Deployment (SPCS)
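Openflow - Snowflake Deployment, which uses Snowpark Container Services (SPCS), provides a streamlined and integrated solution for connectivity. Because SPCS is a self-contained service within Snowflake, it is easy to deploy and manage, and it offers a convenient and cost-effective environment for running your data flows. A key advantage of Openflow - Snowflake Deployment is its native integration with Snowflake's security model, which allows for seamless authentication, authorization, and network security, and simplifies operations.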
Openflow BYOC
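Openflow Bring Your Own Cloud (BYOC) provides a connectivity solution that you can use to connect public and private systems securely and to handle sensitive data preprocessing locally, within the secure bounds of your organization's cloud environment. BYOC refers to a deployment option in which the Openflow data processing engine (the data plane) runs in your own cloud environment, while Snowflake manages the overall Openflow service and the control plane.
Use cases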
Use Openflow if you want to fetch data from any source and put it in any destination with minimal management, coupled with Snowflake's built-in data security and governance.
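Openflow use cases include:
Ingest data from unstructured data sources, such as Google Drive and Box, and make it ready for chat in your AI assistants with Snowflake Cortex or for your own custom processing.
Replicate the change data capture (CDC) of database tables into Snowflake for comprehensive, centralized reporting.
Ingest real-time events from streaming services, such as Apache Kafka, into Snowflake for near real-time analytics.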
Ingest data from SaaS platforms, such as LinkedIn Ads, to Snowflake for reporting, analytics, and insights.
Create an Openflow dataflow using Snowflake and NiFi processors and controller services.
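As a rough mental model for the last use case, a dataflow can be pictured as an ordered chain of processors that each receive a stream of records and pass a transformed stream onward. The sketch below is plain Python written for illustration only; `Record`, `drop_empty`, `tag_source`, and `run_flow` are hypothetical names and are not part of the NiFi or Openflow APIs.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator


@dataclass
class Record:
    """Hypothetical unit of data: content plus attributes (not a NiFi class)."""
    content: str
    attributes: dict = field(default_factory=dict)


# In this sketch a "processor" is any function from a record stream to a record stream.
Processor = Callable[[Iterable[Record]], Iterator[Record]]


def drop_empty(records: Iterable[Record]) -> Iterator[Record]:
    """Filter out records whose content is blank."""
    for record in records:
        if record.content.strip():
            yield record


def tag_source(source: str) -> Processor:
    """Build a processor that stamps each record with the name of its source."""
    def processor(records: Iterable[Record]) -> Iterator[Record]:
        for record in records:
            yield Record(record.content, {**record.attributes, "source": source})
    return processor


def run_flow(records: Iterable[Record], processors: list) -> list:
    """Chain the processors in order and collect what reaches the destination."""
    stream = iter(records)
    for processor in processors:
        stream = processor(stream)
    return list(stream)


incoming = [Record("order created"), Record("   "), Record("order shipped")]
delivered = run_flow(incoming, [drop_empty, tag_source("kafka-orders")])
```

In a real flow, the processors and controller services come from Openflow and NiFi and are wired together in the Openflow UI rather than in code like this.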
Security
Openflow uses industry-leading security features that help ensure you have the highest levels of security for your account, your users, and all the data you store in Snowflake. Some key aspects include:
- Authentication
Runtimes use OAuth2 for authentication to Snowflake.
- Authorization
Openflow supports fine-grained roles for RBAC.
ACCOUNTADMIN can grant the privileges required to create deployments and runtimes.
- Encryption in transit
Openflow connectors support the TLS protocol, using standard Snowflake clients for data ingestion.
All communications between the Openflow deployments and the Openflow control plane are encrypted using the TLS protocol.
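- Secrets management (BYOC)
Integration with AWS Secrets Manager or HashiCorp Vault. For more information, see Encrypted Passwords in Configuration Files (https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#encrypt-config_tool).
- Private link support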
Openflow connectors are compatible with reading and writing data to Snowflake using inbound AWS PrivateLink.
- Tri-Secret Secure support
Openflow connectors are compatible with Tri-Secret Secure for writing data to Snowflake.
Architecture
The following diagram illustrates the architecture of Openflow:
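The authorization item in the list above can be pictured with a toy role-based access control model, in which a role can create a deployment or a runtime only after the corresponding privilege has been granted to it. This is a minimal sketch under stated assumptions: the privilege names `CREATE_DEPLOYMENT` and `CREATE_RUNTIME`, the role `FLOW_ENGINEER`, and the `AccessControl` class are hypothetical, and the rule that only ACCOUNTADMIN may grant is a simplification made for the example. It does not reproduce Snowflake's actual privilege names or grant syntax.

```python
# Hypothetical privilege names, chosen for illustration only.
CREATE_DEPLOYMENT = "CREATE_DEPLOYMENT"
CREATE_RUNTIME = "CREATE_RUNTIME"


class AccessControl:
    """Toy RBAC registry: maps each role name to the privileges granted to it."""

    def __init__(self) -> None:
        self._grants = {}

    def grant(self, grantor: str, privilege: str, role: str) -> None:
        # Simplifying assumption for this sketch: only ACCOUNTADMIN can grant.
        if grantor != "ACCOUNTADMIN":
            raise PermissionError(f"{grantor} cannot grant {privilege}")
        self._grants.setdefault(role, set()).add(privilege)

    def can(self, role: str, privilege: str) -> bool:
        """Return True if the privilege has been granted to the role."""
        return privilege in self._grants.get(role, set())


acl = AccessControl()
acl.grant("ACCOUNTADMIN", CREATE_RUNTIME, "FLOW_ENGINEER")

may_create_runtime = acl.can("FLOW_ENGINEER", CREATE_RUNTIME)
may_create_deployment = acl.can("FLOW_ENGINEER", CREATE_DEPLOYMENT)
```

Here the hypothetical `FLOW_ENGINEER` role can create runtimes but not deployments, which is the kind of separation that fine-grained roles are meant to allow.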

The deployment agent installs and bootstraps the Openflow deployment infrastructure in your VPC and regularly syncs container images from the Snowflake system image registry.
Openflow components include:
- Deployments
A deployment is where your data flows execute, within individual runtimes. You will often have multiple runtimes, all associated with a single deployment, to isolate different projects or teams, or for SDLC reasons.
- Control plane
The control plane is the layer of the architecture that contains all components used to manage and observe Openflow, including the Openflow service and API, which users interact with through the Openflow UI or directly through the Openflow APIs. For Openflow - Snowflake Deployments, the control plane consists of Snowflake-owned public cloud infrastructure and services, and the control plane application itself.
- Openflow - Snowflake Deployment
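The Openflow - Snowflake Deployment service is deployed using an SPCS compute pool and incurs usage charges based on its uptime and compute usage. For more information, see Openflow Snowflake Deployment cost and scaling considerations.
- Runtimes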
Runtimes host your data pipelines, with the framework providing security, simplicity, and scalability. You can deploy Openflow runtimes in your VPC using Openflow. You can deploy Openflow connectors to your runtimes, and also build completely new pipelines using Openflow processors and controller services.
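- Openflow - Snowflake Deployment runtimes
Openflow - Snowflake Deployment runtimes are deployed as an Openflow - Snowflake Deployment service to an Openflow - Snowflake Deployment deployment, which is represented by an underlying compute pool. Customers request a runtime through the deployment, which executes the request against the service on the user's behalf. Once the runtime is created, customers access it in a web browser at the URL generated for that specific Openflow - Snowflake Deployment service.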
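The relationship between the components described above can be summarized as a small data model: one deployment, of either the BYOC or the SPCS kind, holds several isolated runtimes. The sketch below is illustrative only; the class names, fields, and example runtime names are assumptions made for this example and do not correspond to Openflow API objects.

```python
from dataclasses import dataclass, field
from enum import Enum


class DeploymentKind(Enum):
    """Where the data plane runs, per the two deployment models in this topic."""
    BYOC = "customer's own cloud (VPC)"
    SPCS = "Snowpark Container Services compute pool"


@dataclass
class Runtime:
    """Hypothetical record for one runtime that hosts data pipelines."""
    name: str
    purpose: str


@dataclass
class Deployment:
    """Hypothetical record for a deployment and the runtimes associated with it."""
    name: str
    kind: DeploymentKind
    runtimes: dict = field(default_factory=dict)

    def create_runtime(self, name: str, purpose: str) -> Runtime:
        # Runtime names are kept unique within a deployment in this sketch.
        if name in self.runtimes:
            raise ValueError(f"runtime {name!r} already exists")
        runtime = Runtime(name, purpose)
        self.runtimes[name] = runtime
        return runtime


# One deployment, with separate runtimes for SDLC isolation.
deployment = Deployment("analytics", DeploymentKind.BYOC)
deployment.create_runtime("cdc-prod", "production CDC replication")
deployment.create_runtime("cdc-dev", "development and testing")
runtime_names = sorted(deployment.runtimes)
```

Keeping production and development in separate runtimes under the same deployment mirrors the isolation by project, team, or SDLC stage that the Deployments item describes.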