Snowflake Container Runtime
Overview
The Snowflake Container Runtime is a set of preconfigured, customizable environments built for machine learning on Snowpark Container Services, covering interactive experimentation and batch ML workloads such as model training, hyperparameter tuning, batch inference, and fine-tuning. These environments include the most popular machine learning and deep learning frameworks. Used with Snowflake Notebooks, they provide an end-to-end ML experience.
Execution environment
The Container Runtime provides an environment populated with packages and libraries that support a wide variety of ML development tasks inside Snowflake. In addition to the pre-installed packages, you can import packages from external sources like public PyPI repositories, or internally-hosted package repositories that provide a list of packages approved for use inside your organization.
Executions of your custom Python ML workloads and supported training APIs occur within Snowpark Container Services, which offers the ability to run on CPU or GPU compute pools. When using the Snowflake ML APIs, the Container Runtime distributes the processing across available resources.
Container Runtimes are versioned, allowing you to select specific runtime environments, pin your workloads to a specific version, and migrate to updated container runtime environments at your own pace.
Distributed processing
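The Snowflake ML modeling and data loading APIs are built on top of Snowflake ML's distributed processing framework, which maximizes resource utilization by making full use of the available compute power. By default, the framework uses all GPUs on multi-GPU nodes, which significantly improves performance compared to open source packages and shortens overall runtime.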

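Machine learning workloads, including data loading, run in a compute environment managed by Snowflake. The framework can scale resources dynamically to match the requirements of the task at hand, such as training a model or loading data. The resources for each task, including GPU and memory allocation, can be configured through the provided APIs.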
Optimized data loading
The Container Runtime provides a set of data connector APIs that enable connecting Snowflake data sources (including
tables, DataFrames, and Datasets) to popular ML frameworks such as PyTorch and TensorFlow, taking full advantage of
multiple cores or GPUs. Once loaded, the data can be processed using open source packages, or any of the Snowflake ML
APIs, including the distributed versions that are described below. These APIs are found in the snowflake.ml.data
namespace.
The snowflake.ml.data.data_connector.DataConnector class connects Snowpark DataFrames or Snowflake ML Datasets to
TensorFlow or PyTorch Datasets or pandas DataFrames. Instantiate a connector using one of the following class methods:
- DataConnector.from_dataframe: Accepts a Snowpark DataFrame.
- DataConnector.from_dataset: Accepts a Snowflake ML Dataset, specified by name and version.
- DataConnector.from_sources: Accepts a list of sources, each of which can be a DataFrame or a Dataset.
Once you have instantiated the connector (calling the instance, for example, data_connector), call the following
methods to produce the desired kind of output.
- data_connector.to_tf_dataset: Returns a TensorFlow Dataset suitable for use with TensorFlow.
- data_connector.to_torch_dataset: Returns a PyTorch Dataset suitable for use with PyTorch.
For more information on these APIs, see the Snowflake ML API reference.
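The connector workflow described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name load_torch_dataset, the table name, and the batch_size keyword are assumptions and should be checked against the Snowflake ML API reference. The import is deferred so that the module can be loaded outside Container Runtime.

```python
def load_torch_dataset(session, table_name, batch_size=1024):
    """Return a PyTorch dataset backed by a Snowflake table.

    `session` is an active Snowpark Session. `table_name` is a hypothetical
    table that the caller's role can read.
    """
    # Deferred import: the package is only needed when the function runs.
    from snowflake.ml.data.data_connector import DataConnector

    # Build a Snowpark DataFrame and wrap it in a connector.
    snowpark_df = session.table(table_name)
    data_connector = DataConnector.from_dataframe(snowpark_df)

    # Assumption: to_torch_dataset accepts batch_size as a keyword argument.
    return data_connector.to_torch_dataset(batch_size=batch_size)
```

In a notebook, the session would typically come from the active Snowflake session, and the returned dataset can be passed to a torch.utils.data.DataLoader. Swapping the last call for to_tf_dataset would yield a TensorFlow Dataset instead.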
Building with open source
With the foundational CPU and GPU images that come pre-populated with popular ML packages, and the flexibility to
install additional libraries using pip, users can employ familiar and innovative open source frameworks inside Snowflake
Notebooks, without moving data out of Snowflake. You can scale processing by using Snowflake’s distributed
APIs for data loading, training, and hyperparameter optimization, with the familiar APIs of popular OSS
packages, with small changes to the interface to allow for scaling configurations.
The following code demonstrates how to use these APIs to create an XGBoost classifier:
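A minimal sketch of such a classifier, assuming the XGBEstimator and XGBScalingConfig classes from snowflake.ml.modeling.distributors.xgboost accept the arguments shown here. The function name train_xgb_classifier, the table name, and the column names are hypothetical, and the constructor and fit arguments should be verified against the API reference before use.

```python
def train_xgb_classifier(session, table_name, input_cols, label_col, use_gpu=False):
    """Fit a distributed XGBoost binary classifier on a Snowflake table.

    `session` is an active Snowpark Session. `table_name`, `input_cols`,
    and `label_col` are hypothetical names supplied by the caller.
    """
    # Deferred imports: only available inside Container Runtime.
    from snowflake.ml.data.data_connector import DataConnector
    from snowflake.ml.modeling.distributors.xgboost import (
        XGBEstimator,
        XGBScalingConfig,
    )

    # Connect the table to the distributed data loading layer.
    data_connector = DataConnector.from_dataframe(session.table(table_name))

    # Assumption: XGBoost parameters are passed through the params dict,
    # and use_gpu selects GPU workers when set to True.
    estimator = XGBEstimator(
        n_estimators=100,
        params={"objective": "binary:logistic"},
        scaling_config=XGBScalingConfig(use_gpu=use_gpu),
    )

    # Assumption: fit takes the connector plus feature and label column names.
    return estimator.fit(data_connector, input_cols=input_cols, label_col=label_col)
```

On a GPU compute pool, passing use_gpu=True is the only change the sketch would need. The rest of the code stays close to the open source XGBoost interface.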
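The packages in the CPU container runtime differ from those in the GPU container runtime. The following sections list the packages available in each container runtime.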
Snowflake Container Runtime packages
The full list of available packages in Snowflake Container Runtime is maintained as part of the Container Runtime Release Notes.
Optimized training
Container Runtime offers a set of distributed training APIs, including distributed versions of LightGBM, PyTorch,
and XGBoost, that take full advantage of the available resources in the container environment. These are found in the
snowflake.ml.modeling.distributors namespace. The APIs of the distributed classes are similar to those of the
standard versions.
For more information on these APIs, see the API reference.
XGBoost
The primary XGBoost class is snowflake.ml.modeling.distributors.xgboost.XGBEstimator. Related classes include:
- snowflake.ml.modeling.distributors.xgboost.XGBScalingConfig
For an example of working with this API, see the XGBoost on GPU (https://github.com/Snowflake-Labs/sfguide-getting-started-with-container-runtime-apis/blob/main/XGBoost_on_GPU_Quickstart.ipynb) example notebook in the Snowflake Container Runtime GitHub repository.
LightGBM
The primary LightGBM class is snowflake.ml.modeling.distributors.lightgbm.LightGBMEstimator. Related classes include:
- snowflake.ml.modeling.distributors.lightgbm.LightGBMScalingConfig
For an example of working with this API, see the LightGBM on GPU (https://github.com/Snowflake-Labs/sfguide-getting-started-with-container-runtime-apis/blob/main/LightGBM_on_GPU_Quickstart.ipynb) example notebook in the Snowflake Container Runtime GitHub repository.
PyTorch
The primary PyTorch class is snowflake.ml.modeling.distributors.pytorch.PyTorchDistributor. Related classes and functions include:
- snowflake.ml.modeling.distributors.pytorch.WorkerResourceConfig
- snowflake.ml.modeling.distributors.pytorch.PyTorchScalingConfig
- snowflake.ml.modeling.distributors.pytorch.Context
- snowflake.ml.modeling.distributors.pytorch.get_context
For an example of working with this API, see the PyTorch on GPU (https://github.com/Snowflake-Labs/sfguide-getting-started-with-container-runtime-apis/blob/main/PyTorch_on_GPU_Quickstart.ipynb) example notebook in the Snowflake Container Runtime GitHub repository.
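To show how the classes listed above might fit together, here is a minimal sketch that launches a caller-supplied training function on a single node. The function name run_distributed_training and its parameters are hypothetical. The keyword arguments passed to PyTorchScalingConfig, WorkerResourceConfig, and PyTorchDistributor are assumptions based on the example notebook and should be confirmed in the API reference.

```python
def run_distributed_training(train_func, train_connector, workers_per_node=1, use_gpu=True):
    """Launch `train_func` across workers with PyTorchDistributor.

    `train_func` is a zero-argument function that runs once per worker and
    would typically call get_context() to find its rank and data shard.
    `train_connector` is a DataConnector holding the training data.
    """
    # Deferred imports: only available inside Container Runtime.
    from snowflake.ml.modeling.distributors.pytorch import (
        PyTorchDistributor,
        PyTorchScalingConfig,
        WorkerResourceConfig,
    )

    # Assumption: each worker requests one CPU and, optionally, one GPU.
    scaling_config = PyTorchScalingConfig(
        num_nodes=1,
        num_workers_per_node=workers_per_node,
        resource_requirements_per_worker=WorkerResourceConfig(
            num_cpus=1,
            num_gpus=1 if use_gpu else 0,
        ),
    )

    distributor = PyTorchDistributor(
        train_func=train_func,
        scaling_config=scaling_config,
    )

    # Assumption: run accepts a mapping from a dataset name to a connector.
    return distributor.run(dataset_map={"train": train_connector})
```

Setting workers_per_node to the number of GPUs on the node is a reasonable starting point when each worker holds one GPU.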
Next steps
- To try a Snowflake Notebook using Container Runtime, see Notebooks on Container Runtime.