Snowpark Migration Accelerator: Code Extraction¶
The Snowpark Migration Accelerator (SMA) processes every file in the specified directory. Although it creates an inventory entry for each file, it analyzes Spark API references only in files with specific extensions.
There are several ways to add files to this directory.
Place all relevant code files into a single directory before running the migration process.
To extract notebooks from an existing environment (such as Databricks), you can use extraction scripts to help with the migration process.
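As a rough illustration of the first approach, here is a minimal shell sketch that collects code files into one input directory for SMA. The project path, file names, and the extensions shown are assumptions for the example, not an exhaustive list of what SMA analyzes:

```shell
# Sketch: gather scattered code files into a single directory for SMA.
# "my_spark_project" and "job.py" are illustrative names only.
mkdir -p my_spark_project sma_input

# (demo only) create a sample file so the copy below has input
printf 'print("hello")\n' > my_spark_project/job.py

# Copy only code files with extensions SMA can analyze
# (.py, .ipynb, .scala shown here as examples)
find my_spark_project -type f \
  \( -name '*.py' -o -name '*.ipynb' -o -name '*.scala' \) \
  -exec cp {} sma_input/ \;

ls sma_input
```

You would then point SMA at `sma_input` as its source directory.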
Extraction Scripts¶
Snowflake provides publicly available extraction scripts that you can find on the Snowflake Labs GitHub page (https://github.com/Snowflake-Labs/SC.DDLExportScripts/tree/main). For Spark migrations, these scripts support various platforms.
Databricks¶
For Jupyter (.ipynb) or Databricks (.dbc) notebooks that run in Databricks, you can place them directly in a directory for SMA analysis without any extraction. To learn how to export your Databricks notebook files, visit the Databricks documentation: https://docs.databricks.com/en/notebooks/notebook-export-import.html#export-notebooks.
As an alternative, you can follow the instructions and use the scripts available in the Databricks folder of the SC.DDLExportScripts repository: https://github.com/Snowflake-Labs/SC.DDLExportScripts/tree/main/Databricks.
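For bulk exports, a hedged sketch using the legacy Databricks CLI's `workspace export_dir` command is shown below. The workspace path is an illustrative assumption, and the CLI must already be installed and authenticated; the guard keeps the script harmless where the CLI is absent:

```shell
# Sketch, assuming the legacy Databricks CLI is installed and configured.
# The workspace path below is a placeholder, not a real path.
mkdir -p sma_input
if command -v databricks >/dev/null 2>&1; then
  # Recursively download notebooks from the workspace folder
  databricks workspace export_dir /Users/me@example.com/spark_jobs ./sma_input
else
  echo "Databricks CLI not found; export notebooks from the Databricks UI instead."
fi
```

The exported files land in a local directory that you can hand to SMA as its input.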
Additional information about extracting data from other platforms will be added later.