Snowpark Migration Accelerator: Release Notes¶
Note that the release notes below are organized by release date. Version numbers for both the application and the conversion core appear below.
Version 2.10.4 (November 18, 2025)¶
Application & CLI Version: 2.10.4¶
Included SMA Core Versions¶
Snowpark Conversion Core: 8.1.8
Engine Release Notes¶
Fixed¶
Fixed an issue where the SMA generated corrupted Databricks notebook files in the output directory during Assessment mode execution.
Fixed an issue where the SMA would crash if the input directory contained folders named “SMA_ConvertedNotebooks”.
Version 2.10.3 (October 30, 2025)¶
Application & CLI Version: 2.10.3¶
Included SMA Core Versions¶
Snowpark Conversion Core: 8.1.7
Engine Release Notes¶
Added¶
Added the Snowpark Connect readiness score. This new score measures the percentage of Spark API references in your codebase that are supported by Snowpark Connect for Spark (see the worked example at the end of this section).
This will now be the only score shown in assessment mode. To generate the Snowpark API Readiness Score, run the SMA in conversion mode.
Added support for SQL embedded migration for literal string concatenations assigned to a local variable in the same scope of execution. Supported scenarios now include:
sqlStat = "SELECT colName " + "FROM myTable"
session.sql(sqlStat)
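A worked example of how a readiness score of this kind is computed (illustrative numbers, not from a real workload): a codebase with 200 Spark API references, 180 of which are supported by Snowpark Connect for Spark, would get a Snowpark Connect readiness score of 180 / 200 = 90%.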
Changed¶
Updated the EWI URLs in the Issues.csv inventory to point to the main Snowflake documentation site.
Fixed¶
Fixed a code issue that caused inner project configuration files (e.g., pom.xml, build.sbt, build.gradle) to be incorrectly placed in the root of the output directory instead of the correct inner directories after migration.
Desktop Release Notes¶
Added¶
Added the Snowpark Connect readiness score and updated the assessment execution flow.
When running the application in assessment mode, only the Snowpark Connect readiness score is now displayed.
When running the application in conversion mode, the Snowpark API readiness score is displayed (the Snowpark Connect Readiness will not be shown).
Changed¶
Updated all in-application documentation links to point to the official Snowflake documentation, replacing the legacy SnowConvert (https://docs.snowconvert.com/sma) site.
Version 2.10.2 (Oct 27, 2025)¶
Application & CLI Version 2.10.2¶
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.73
Fixed¶
Fixed an issue where the Snowpark Migration Accelerator failed to convert DBC files into Jupyter notebooks properly.
Version 2.10.1 (Oct 23, 2025)¶
Application & CLI Version 2.10.1¶
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.72
Added¶
Added support for Snowpark Scala v1.17.0:
From Not Supported to Direct:
Dataset:
org.apache.spark.sql.Dataset.isEmpty → com.snowflake.snowpark.DataFrame.isEmpty
Row:
org.apache.spark.sql.Row.mkString → com.snowflake.snowpark.Row.mkString
StructType:
org.apache.spark.sql.types.StructType.fieldNames → com.snowflake.snowpark.types.StructType.fieldNames
From Not Supported to Rename:
Functions:
org.apache.spark.functions.flatten → com.snowflake.snowpark.functions.array_flatten
From Direct to Rename:
Functions:
org.apache.spark.functions.to_date → com.snowflake.snowpark.functions.try_to_date
org.apache.spark.functions.to_timestamp → com.snowflake.snowpark.functions.try_to_timestamp
From Direct Helper to Rename:
Functions:
org.apache.spark.sql.functions.concat_ws → com.snowflake.snowpark.functions.concat_ws_ignore_nulls
From Not Defined to Direct:
Functions:
org.apache.spark.functions.try_to_timestamp → com.snowflake.snowpark.functions.try_to_timestamp
Embedded SQL is now migrated when a SQL statement literal is assigned to a local variable. Example:
sqlStat = "SELECT colName FROM myTable"
session.sql(sqlStat)
Embedded SQL is now supported for literal string concatenations. Example:
session.sql("SELECT colName " + "FROM myTable")
Changed¶
Updated the supported versions of the Snowpark Python API and Snowpark Pandas API from 1.36.0 to 1.39.0.
Updated the mapping status for the following PySpark xpath functions from NotSupported to Direct with EWI SPRKPY1103:
pyspark.sql.functions.xpath
pyspark.sql.functions.xpath_boolean
pyspark.sql.functions.xpath_double
pyspark.sql.functions.xpath_float
pyspark.sql.functions.xpath_int
pyspark.sql.functions.xpath_long
pyspark.sql.functions.xpath_number
pyspark.sql.functions.xpath_short
pyspark.sql.functions.xpath_string
Updated the mapping status for the following PySpark elements from NotDefined to Direct (see the sketch after this list):
pyspark.sql.functions.bit_and → snowflake.snowpark.functions.bitand_agg
pyspark.sql.functions.bit_or → snowflake.snowpark.functions.bitor_agg
pyspark.sql.functions.bit_xor → snowflake.snowpark.functions.bitxor_agg
pyspark.sql.functions.getbit → snowflake.snowpark.functions.getbit
Updated the mapping status for the following Pandas elements from NotSupported to Direct:
pandas.core.indexes.base.Index → modin.pandas.Index
pandas.core.indexes.base.Index.get_level_values → modin.pandas.Index.get_level_values
Updated the mapping status for the following PySpark function from NotSupported to Rename:
pyspark.sql.functions.now → snowflake.snowpark.functions.current_timestamp
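A minimal before/after sketch for one of the mappings above (illustrative only; the column name is made up, df is assumed to be an existing DataFrame, and availability of the Snowpark function depends on your Snowpark version):

.. code-block:: python

    # PySpark source:
    #   from pyspark.sql import functions as F
    #   df.agg(F.bit_and("flags"))

    # Snowpark Python output, per the bit_and -> bitand_agg mapping above:
    from snowflake.snowpark.functions import bitand_agg, col

    df.agg(bitand_agg(col("flags")))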
Fixed¶
Fixed an issue where Scala imports were not migrated when a rename was involved.
Example:
Source code:
.. code-block:: scala

    package com.example.functions

    import org.apache.spark.sql.functions.{to_timestamp, lit}

    object ToTimeStampTest extends App {
      to_timestamp(lit("sample"))
      to_timestamp(lit("sample"), "yyyy-MM-dd")
    }

Output code:

.. code-block:: scala

    package com.example.functions

    import com.snowflake.snowpark.functions.{try_to_timestamp, lit}
    import com.snowflake.snowpark_extensions.Extensions._
    import com.snowflake.snowpark_extensions.Extensions.functions._

    object ToTimeStampTest extends App {
      try_to_timestamp(lit("sample"))
      try_to_timestamp(lit("sample"), "yyyy-MM-dd")
    }
Version 2.10.0 (Sep 24, 2025)¶
Application & CLI Version 2.10.0¶
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.62
Added¶
Added functionality to migrate SQL embedded with Python format interpolation (see the sketch after this list).
Added support for DataFrame.select and DataFrame.sort transformations for greater data processing flexibility.
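A minimal sketch of the source pattern this migration targets (illustrative only; the table and column names are made up, and spark is assumed to be an existing SparkSession):

.. code-block:: python

    # Embedded SQL built through Python format interpolation; the SMA can now
    # recognize and migrate the SQL carried inside the interpolated string.
    table_name = "myTable"
    query = "SELECT colName FROM {}".format(table_name)
    spark.sql(query)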
Changed¶
Bumped the supported versions of the Snowpark Python API and Snowpark Pandas API to 1.36.0.
Updated the mapping status of pandas.core.frame.DataFrame.boxplot from Not Supported to Direct.
Updated the mapping status of DataFrame.select, Dataset.select, DataFrame.sort and Dataset.sort from Direct to Transformation. Snowpark Scala allows a sequence of columns to be passed directly to the select and sort functions, so this transformation changes all usages such as df.select(cols: _*) to df.select(cols) and df.sort(cols: _*) to df.sort(cols).
Bumped the Python AST and Parser version to 149.1.9.
Updated the status of the following pandas functions to Direct:
pandas.core.frame.DataFrame.to_excel
pandas.core.series.Series.to_excel
pandas.io.feather_format.read_feather
pandas.io.orc.read_orc
pandas.io.stata.read_stata
Updated the status of pyspark.sql.pandas.map_ops.PandasMapOpsMixin.mapInPandas to Workaround, using EWI SPRKPY1102.
Fixed¶
Fixed an issue that affected SqlEmbedded transformations when chained method calls were used.
Fixed transformations involving PySqlExpr to use the new PyLiteralSql and avoid losing Tails.
Resolved internal stability issues to improve tool robustness and reliability.
Version 2.7.7 (Aug 28, 2025)¶
Application & CLI Version 2.7.7¶
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.46
Added¶
Added new Pandas EWI documentation PNDSPY1011.
Added support for the following Pandas functions:
pandas.core.algorithms.unique
pandas.core.dtypes.missing.isna
pandas.core.dtypes.missing.isnull
pandas.core.dtypes.missing.notna
pandas.core.dtypes.missing.notnull
pandas.core.resample.Resampler.count
pandas.core.resample.Resampler.max
pandas.core.resample.Resampler.mean
pandas.core.resample.Resampler.median
pandas.core.resample.Resampler.min
pandas.core.resample.Resampler.size
pandas.core.resample.Resampler.sum
pandas.core.arrays.timedeltas.TimedeltaArray.total_seconds
pandas.core.series.Series.get
pandas.core.series.Series.to_frame
pandas.core.frame.DataFrame.assign
pandas.core.frame.DataFrame.get
pandas.core.frame.DataFrame.to_numpy
pandas.core.indexes.base.Index.is_unique
pandas.core.indexes.base.Index.has_duplicates
pandas.core.indexes.base.Index.shape
pandas.core.indexes.base.Index.array
pandas.core.indexes.base.Index.str
pandas.core.indexes.base.Index.equals
pandas.core.indexes.base.Index.identical
pandas.core.indexes.base.Index.unique
Added support for the following Spark Scala functions:
org.apache.spark.sql.functions.format_number
org.apache.spark.sql.functions.from_unixtime
org.apache.spark.sql.functions.instr
org.apache.spark.sql.functions.months_between
org.apache.spark.sql.functions.pow
org.apache.spark.sql.functions.to_unix_timestamp
org.apache.spark.sql.Row.getAs
Changed¶
Bumped the version of Snowpark Pandas API supported by the SMA to 1.33.0.
Bumped the version of Snowpark Scala API supported by the SMA to 1.16.0.
Updated the mapping status of pyspark.sql.group.GroupedData.pivot from Transformation to Direct.
Updated the mapping status of org.apache.spark.sql.Builder.master from NotSupported to Transformation. This transformation removes all the identified usages of this element during code conversion.
Updated the mapping status of org.apache.spark.sql.types.StructType.fieldIndex from NotSupported to Direct.
Updated the mapping status of org.apache.spark.sql.Row.fieldIndex from NotSupported to Direct.
Updated the mapping status of org.apache.spark.sql.SparkSession.stop from NotSupported to Rename. All the identified usages of this element are renamed to com.snowflake.snowpark.Session.close during code conversion.
Updated the mapping status of org.apache.spark.sql.DataFrame.unpersist and org.apache.spark.sql.Dataset.unpersist from NotSupported to Transformation. This transformation removes all the identified usages of these elements during code conversion.
Fixed¶
Fixed the continuation backslash on removed tailed functions.
Fixed the LIBRARY_PREFIX column in the ConversionStatusLibraries.csv file to use the right identifier for the scikit-learn library family (scikit-*).
Fixed a bug where multiline grouped operations were not parsed.
Version 2.9.0 (Sep 09, 2025)¶
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.53
Added¶
The following mappings are now performed for org.apache.spark.sql.Dataset[T]:
org.apache.spark.sql.Dataset.union is now com.snowflake.snowpark.DataFrame.unionAll
org.apache.spark.sql.Dataset.unionByName is now com.snowflake.snowpark.DataFrame.unionAllByName
Added support for org.apache.spark.sql.functions.broadcast as a transformation.
Changed¶
Increased the supported Snowpark Python API version for the SMA from 1.27.0 to 1.33.0.
The status of the pyspark.sql.functions.randn function has been updated to Direct.
Fixed¶
Resolved an issue where org.apache.spark.SparkContext.parallelize was not resolving; it is now supported as a transformation.
Fixed the Dataset.persist transformation to work with any type of Dataset, not just Dataset[Row].
Version 2.7.6 (Jul 17, 2025)¶
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.30
Added¶
Adjusted mappings for spark.DataReader methods:
DataFrame.union is now DataFrame.unionAll.
DataFrame.unionByName is now DataFrame.unionAllByName.
Added multi-level artifact dependency columns to the artifact inventory.
Added new Pandas EWI documentation, from PNDSPY1005 to PNDSPY1010.
Added a specific EWI for pandas.core.series.Series.apply.
Changed¶
Bumped the version of the Snowpark Pandas API supported by the SMA from 1.27.0 to 1.30.0.
Fixed¶
Fixed an issue with missing values in the formula to get the SQL readiness score.
Fixed a bug that was causing some Pandas elements to have the default EWI message from PySpark.
Version 2.7.5 (Jul 2, 2025)¶
Application & CLI Version 2.7.5¶
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.19
Changed¶
Refactored Pandas imports: Pandas imports now use modin.pandas instead of snowflake.snowpark.modin.pandas.
Improved the dbutils and magic commands transformation (see the sketch below):
A new sfutils.py file is now generated, and all dbutils prefixes are replaced with sfutils.
For Databricks (DBX) notebooks, an implicit import for sfutils is automatically added.
The sfutils module simulates various dbutils methods, including file system operations (dbutils.fs) via a defined Snowflake FileSystem (SFFS) stage, and handles notebook execution (dbutils.notebook.run) by transforming it to EXECUTE NOTEBOOK SQL functions.
dbutils.notebook.exit is removed, as it is not required in Snowflake.
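A minimal before/after sketch of this transformation (illustrative only; the paths, notebook name, and timeout are made up, and the exact code the SMA generates may differ):

.. code-block:: python

    # Databricks source:
    #   dbutils.fs.ls("/data/input")
    #   dbutils.notebook.run("./cleanup", 60)
    #   dbutils.notebook.exit("done")

    # After migration: the dbutils prefix becomes sfutils, whose file system
    # calls resolve against the SFFS stage, and notebook execution is handled
    # through EXECUTE NOTEBOOK; notebook.exit is dropped entirely.
    import sfutils  # generated by the SMA next to the converted notebook

    sfutils.fs.ls("/data/input")
    sfutils.notebook.run("./cleanup", 60)  # issues an EXECUTE NOTEBOOK statement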
Fixed¶
Updates in SnowConvert Reports: SnowConvert reports now include the CellId column when instances originate from SMA, and the FileName column displays the full path.
Updated Artifacts Dependency for SnowConvert Reports: The SMA's artifact inventory report, which was previously impacted by the integration of SnowConvert, has been restored. This update enables the SMA tool to accurately capture and analyze Object References and Missing Object References directly from SnowConvert reports, thereby ensuring the correct retrieval of SQL dependencies for the inventory.
Version 2.7.4 (Jun 26, 2025)¶
Application & CLI Version 2.7.4¶
Desktop App¶
Added¶
Added telemetry improvements.
Fixed¶
Fixed the documentation links in the conversion settings pop-up and in the Pandas EWIs.
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.16
Added¶
Added a transformation from Spark XML to Snowpark.
Added a Databricks SQL option to the SQL source language.
Added a transformation for JDBC read connections.
Changed¶
All the SnowConvert reports are now copied to the backup Zip file.
The folder is renamed from SqlReports to SnowConvertReports.
SqlFunctionsInventory is moved to the Reports folder.
All the SnowConvert reports are sent to telemetry.
Fixed¶
Fixed a non-deterministic issue with the SQL readiness score.
Fixed a false-positive critical result that made the desktop app crash.
Fixed an issue causing the artifacts dependency report not to show the SQL objects.
Version 2.7.2 (Jun 10, 2025)¶
Application & CLI Version 2.7.2¶
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.2
Fixed¶
Addressed an issue with SMA execution on the latest Windows OS, as previously reported. This fix resolves the issues encountered in version 2.7.1.
Version 2.7.1 (Jun 9, 2025)¶
Application & CLI Version 2.7.1¶
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.1
Added¶
The Snowpark Migration Accelerator (SMA) now orchestrates SnowConvert (https://docs.snowconvert.com/sc/general/about) to process SQL found in user workloads, including embedded SQL in Python / Scala code, Notebook SQL cells, .sql files, and .hql files.
SnowConvert now enhances the previous SMA capabilities:
Spark SQL (https://docs.snowconvert.com/sc/translation-references/spark-dbx)
A new folder in Reports, called SQL Reports, contains the reports generated by SnowConvert.
Known Issues¶
SQL reports from the previous SMA version will appear empty for the following:
Reports/SqlElementsInventory.csv: partially covered by Reports/SqlReports/Elements.yyyymmdd.hhmmss.csv.
Reports/SqlFunctionsInventory.csv: refer to the new location with the same name, Reports/SqlReports/SqlFunctionsInventory.csv.
The artifact dependency inventory:
In the ArtifactDependencyInventory, the column for the SQL Object will appear empty.
Version 2.6.10 (May 5, 2025)¶
Application & CLI Version 2.6.10¶
Included SMA Core Versions¶
Snowpark Conversion Core 7.4.0
Fixed¶
Fixed wrong values in the 'checkpoints.json' file:
The 'sample' value had no decimals (for integer values) and no quotes.
The 'entryPoint' value had dots instead of slashes and was missing the file extension.
Updated the default value of the setting 'Convert DBX notebooks to Snowflake notebooks' to TRUE.
Version 2.6.8 (Apr 28, 2025)¶
Application & CLI Version 2.6.8¶
Desktop App¶
Added recognition of the checkpoints execution settings mechanism.
Added a mechanism to collect DBX magic commands into DbxElementsInventory.csv.
Added 'checkpoints.json' generation into the input directory.
Added a new EWI for all unsupported magic commands.
Added the collection of dbutils into DbxElementsInventory.csv from Scala source notebooks.
Included SMA Core Versions¶
Snowpark Conversion Core 7.2.53
Changed¶
Updated the handling of transformations from DBX Scala elements to Jupyter Python elements, commenting out the entire code of the cell.
Updated the handling of transformations of dbutils.notebook.run and "r" commands; for the latter, the entire code of the cell is also commented out.
Updated the name and the letter of the key used for the conversion of the notebook files.
Fixed¶
Fixed the bug that was causing the transformation of DBX notebooks into .ipynb files to have the wrong format.
Fixed the bug that was causing .py DBX notebooks to not be transformable into .ipynb files.
Fixed a bug that was causing comments to be missing in the output code of DBX notebooks.
Fixed a bug that was causing raw Scala files to be converted into .ipynb files.
Version 2.6.7 (Apr 21, 2025)¶
Application & CLI Version 2.6.7¶
Included SMA Core Versions¶
Snowpark Conversion Core 7.2.42
Changed¶
Updated the DataFramesInventory to fill the EntryPoints column.
Version 2.6.6 (Apr 7, 2025)¶
Application & CLI Version 2.6.6¶
Desktop App¶
Added¶
Updated the DBx EWI link on the UI results page.
Included SMA Core Versions¶
Snowpark Conversion Core 7.2.39
Added¶
Added Execution Flow inventory generation.
Added implicit session setup in every DBx notebook transformation.
Changed¶
Renamed DbUtilsUsagesInventory.csv to DbxElementsInventory.csv.
Fixed¶
Fixed a bug that caused a Parsing error when a backslash came after a type hint.
Fixed relative imports that do not start with a dot and relative imports with a star.
Version 2.6.5 (Mar 27, 2025)¶
Application & CLI Version 2.6.5¶
Desktop App¶
Added¶
Added a new conversion setting toggle to enable or disable the Sma-Checkpoints feature.
Fixed a reporting issue so the app does not crash when the POST API returns 500.
Included SMA Core Versions¶
Snowpark Conversion Core 7.2.26
Added¶
Added generation of the checkpoints.json file into the output folder based on the DataFramesInventory.csv.
Added "disableCheckpoints" flag into the CLI commands and additional parameters of the code processor.
Added a new replacer for Python to transform the dbutils.notebook.run node.
Added new replacers to transform the magic %run command.
Added new replacers (Python and Scala) to remove the dbutils.notebook.exit node.
Added Location column to artifacts inventory.
Changed¶
Refactored the normalized directory separator used in some parts of the solution.
Centralized the DBC extraction working folder name handling.
Updated the Snowpark and Pandas version to v1.27.0.
Updated the artifacts inventory columns to:
Name -> Dependency
File -> FileId
Status -> Status_detail
Added new column to the artifacts inventory:
Success
Fixed¶
Fixed an issue where the DataFrames inventory was not being uploaded to the stage correctly.
Version 2.6.4 (Mar 12, 2025)¶
Application & CLI Version 2.6.4¶
Included SMA Core Versions¶
Snowpark Conversion Core 7.2.0
Added¶
Added an Artifact Dependency Inventory.
Added a replacer and EWI for the pyspark.sql.types.StructType.fieldNames method, mapping it to the snowflake.snowpark.types.StructType.fieldNames attribute (see the sketch at the end of this section).
Added the following PySpark functions with the status:
Direct status:
pyspark.sql.functions.bitmap_bit_position
pyspark.sql.functions.bitmap_bucket_number
pyspark.sql.functions.bitmap_construct_agg
pyspark.sql.functions.equal_null
pyspark.sql.functions.ifnull
pyspark.sql.functions.localtimestamp
pyspark.sql.functions.max_by
pyspark.sql.functions.min_by
pyspark.sql.functions.nvl
pyspark.sql.functions.regr_avgx
pyspark.sql.functions.regr_avgy
pyspark.sql.functions.regr_count
pyspark.sql.functions.regr_intercept
pyspark.sql.functions.regr_slope
pyspark.sql.functions.regr_sxx
pyspark.sql.functions.regr_sxy
pyspark.sql.functions.regr
NotSupported status:
pyspark.sql.functions.map_contains_key
pyspark.sql.functions.position
pyspark.sql.functions.regr_r2
pyspark.sql.functions.try_to_binary
Added the following Pandas functions with the status:
pandas.core.series.Series.str.ljust
pandas.core.series.Series.str.center
pandas.core.series.Series.str.pad
pandas.core.series.Series.str.rjust
Updated the following PySpark functions with the status:
From WorkAround to Direct:
pyspark.sql.functions.acosh
pyspark.sql.functions.asinh
pyspark.sql.functions.atanh
pyspark.sql.functions.instr
pyspark.sql.functions.log10
pyspark.sql.functions.log1p
pyspark.sql.functions.log2
From NotSupported to Direct:
pyspark.sql.functions.bit_length
pyspark.sql.functions.cbrt
pyspark.sql.functions.nth_value
pyspark.sql.functions.octet_length
pyspark.sql.functions.base64
pyspark.sql.functions.unbase64
Updated the following Pandas functions with the status:
From NotSupported to Direct:
pandas.core.frame.DataFrame.pop
pandas.core.series.Series.between
pandas.core.series.Series.pop
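A minimal before/after sketch of the fieldNames replacer described above (illustrative only; df is assumed to be an existing DataFrame):

.. code-block:: python

    # PySpark source: fieldNames is a method, so it is called.
    #   names = df.schema.fieldNames()

    # Snowpark Python output: fieldNames is an attribute, so the replacer
    # drops the call parentheses.
    names = df.schema.fieldNames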
Version 2.6.3 (Mar 6, 2025)¶
Application & CLI Version 2.6.3¶
Included SMA Core Versions¶
Snowpark Conversion Core 7.1.13
Added¶
Added a csv generator class for the new inventory creation.
Added a "full_name" column to the import usages inventory.
Added a transformation from pyspark.sql.functions.concat_ws to snowflake.snowpark.functions._concat_ws_ignore_nulls (see the sketch after this list).
Added logic for the generation of checkpoints.json.
Added the following inventories:
DataFramesInventory.csv
CheckpointsInventory.csv
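A minimal before/after sketch of the concat_ws transformation above (illustrative only; the separator and column names are made up, df is assumed to exist, and the helper's exact signature may vary between Snowpark versions):

.. code-block:: python

    # PySpark source:
    #   from pyspark.sql import functions as F
    #   df.select(F.concat_ws("-", "year", "month"))

    # Snowpark Python output, per the mapping above:
    from snowflake.snowpark.functions import _concat_ws_ignore_nulls

    df.select(_concat_ws_ignore_nulls("-", "year", "month"))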
Version 2.6.0 (Feb 21, 2025)¶
Application & CLI Version 2.6.0¶
Desktop App¶
Updated the licensing agreement; acceptance is required.
Included SMA Core Versions¶
Snowpark Conversion Core 7.1.2
Added¶
Updated the mapping status for the following PySpark elements, from NotSupported to Direct:
pyspark.sql.types.ArrayType.json, pyspark.sql.types.ArrayType.jsonValue, pyspark.sql.types.ArrayType.simpleString, pyspark.sql.types.ArrayType.typeName
pyspark.sql.types.AtomicType.json, pyspark.sql.types.AtomicType.jsonValue, pyspark.sql.types.AtomicType.simpleString, pyspark.sql.types.AtomicType.typeName
pyspark.sql.types.BinaryType.json, pyspark.sql.types.BinaryType.jsonValue, pyspark.sql.types.BinaryType.simpleString, pyspark.sql.types.BinaryType.typeName
pyspark.sql.types.BooleanType.json, pyspark.sql.types.BooleanType.jsonValue, pyspark.sql.types.BooleanType.simpleString, pyspark.sql.types.BooleanType.typeName
pyspark.sql.types.ByteType.json, pyspark.sql.types.ByteType.jsonValue, pyspark.sql.types.ByteType.simpleString, pyspark.sql.types.ByteType.typeName
pyspark.sql.types.DecimalType.json, pyspark.sql.types.DecimalType.jsonValue, pyspark.sql.types.DecimalType.simpleString, pyspark.sql.types.DecimalType.typeName
pyspark.sql.types.DoubleType.json, pyspark.sql.types.DoubleType.jsonValue, pyspark.sql.types.DoubleType.simpleString, pyspark.sql.types.DoubleType.typeName
pyspark.sql.types.FloatType.json, pyspark.sql.types.FloatType.jsonValue, pyspark.sql.types.FloatType.simpleString, pyspark.sql.types.FloatType.typeName
pyspark.sql.types.FractionalType.json, pyspark.sql.types.FractionalType.jsonValue, pyspark.sql.types.FractionalType.simpleString, pyspark.sql.types.FractionalType.typeName
pyspark.sql.types.IntegerType.json, pyspark.sql.types.IntegerType.jsonValue, pyspark.sql.types.IntegerType.simpleString, pyspark.sql.types.IntegerType.typeName
pyspark.sql.types.IntegralType.json, pyspark.sql.types.IntegralType.jsonValue, pyspark.sql.types.IntegralType.simpleString, pyspark.sql.types.IntegralType.typeName
pyspark.sql.types.LongType.json, pyspark.sql.types.LongType.jsonValue, pyspark.sql.types.LongType.simpleString, pyspark.sql.types.LongType.typeName
pyspark.sql.types.MapType.json, pyspark.sql.types.MapType.jsonValue, pyspark.sql.types.MapType.simpleString, pyspark.sql.types.MapType.typeName
pyspark.sql.types.NullType.json, pyspark.sql.types.NullType.jsonValue, pyspark.sql.types.NullType.simpleString, pyspark.sql.types.NullType.typeName
pyspark.sql.types.NumericType.json, pyspark.sql.types.NumericType.jsonValue, pyspark.sql.types.NumericType.simpleString, pyspark.sql.types.NumericType.typeName
pyspark.sql.types.ShortType.json, pyspark.sql.types.ShortType.jsonValue, pyspark.sql.types.ShortType.simpleString, pyspark.sql.types.ShortType.typeName
pyspark.sql.types.StringType.json, pyspark.sql.types.StringType.jsonValue, pyspark.sql.types.StringType.simpleString, pyspark.sql.types.StringType.typeName
pyspark.sql.types.StructType.json, pyspark.sql.types.StructType.jsonValue, pyspark.sql.types.StructType.simpleString, pyspark.sql.types.StructType.typeName
pyspark.sql.types.TimestampType.json, pyspark.sql.types.TimestampType.jsonValue, pyspark.sql.types.TimestampType.simpleString, pyspark.sql.types.TimestampType.typeName
pyspark.sql.types.StructField.simpleString, pyspark.sql.types.StructField.typeName, pyspark.sql.types.StructField.json, pyspark.sql.types.StructField.jsonValue
pyspark.sql.types.DataType.json, pyspark.sql.types.DataType.jsonValue, pyspark.sql.types.DataType.simpleString, pyspark.sql.types.DataType.typeName
pyspark.sql.session.SparkSession.getActiveSession
pyspark.sql.session.SparkSession.version
pandas.io.html.read_html
pandas.io.json._normalize.json_normalize
pyspark.sql.types.ArrayType.fromJson
pyspark.sql.types.MapType.fromJson
pyspark.sql.types.StructField.fromJson
pyspark.sql.types.StructType.fromJson
pandas.core.groupby.generic.DataFrameGroupBy.pct_change
pandas.core.groupby.generic.SeriesGroupBy.pct_change
Updated the mapping status for the following Pandas elements, from NotSupported to Direct:
pandas.io.html.read_html
pandas.io.json._normalize.json_normalize
pandas.core.groupby.generic.DataFrameGroupBy.pct_change
pandas.core.groupby.generic.SeriesGroupBy.pct_change
Updated the mapping status for the following PySpark elements, from Rename to Direct:
pyspark.sql.functions.collect_list
pyspark.sql.functions.size
Fixed¶
Standardized the format of the version number in the inventories.
Version 2.5.2 (Feb 5, 2025)¶
Hotfix: Application & CLI Version 2.5.2¶
Desktop App¶
Fixed an issue when converting with the sample project option.
Included SMA Core Versions¶
Snowpark Conversion Core 5.3.0
Version 2.5.1 (Feb 4, 2025)¶
Application & CLI Version 2.5.1¶
Desktop App¶
Added a new modal for when the user does not have write permission.
Updated the licensing agreement; users are required to accept it.
CLI¶
Fixed an issue with the year shown on the CLI screen when running "--version" or "-v".
Included SMA Core Versions¶
Snowpark Conversion Core 5.3.0
Added¶
Added the following Python Third-Party libraries with Direct status:
about-time, affinegap, aiohappyeyeballs, alibi-detect, alive-progress, allure-nose2, allure-robotframework, anaconda-cloud-cli, anaconda-mirror, astropy-iers-data, asynch, asyncssh, autots, autoviml, aws-msk-iam-sasl-signer-python, azure-functions, backports.tarfile, blas, bottle, bson, cairo, capnproto, captum, categorical-distance, census, clickhouse-driver, clustergram, cma, conda-anaconda-telemetry, configspace, cpp-expected, dask-expr, data-science-utils, databricks-sdk, datetime-distance, db-dtypes, dedupe, dedupe-variable-datetime, dedupe_lehvenshtein_search, dedupe_levenshtein_search, diff-cover, diptest, dmglib, docstring_parser, doublemetaphone, dspy-ai, econml, emcee, emoji, environs, eth-abi, eth-hash, eth-typing, eth-utils, expat, filetype, fitter, flask-cors, fpdf2, frozendict, gcab, geojson, gettext, glib-tools, google-ads, google-ai-generativelanguage, google-api-python-client, google-auth-httplib2, google-cloud-bigquery, google-cloud-bigquery-core, google-cloud-bigquery-storage, google-cloud-bigquery-storage-core, google-cloud-resource-manager, google-generativeai, googlemaps, grapheme, graphene, graphql-relay, gravis, greykite, grpc-google-iam-v1, harfbuzz, hatch-fancy-pypi-readme, haversine, hiclass, hicolor-icon-theme, highered, hmmlearn, holidays-ext, httplib2, icu, imbalanced-ensemble, immutabledict, importlib-metadata, importlib-resources, inquirerpy, iterative-telemetry, jaraco.context, jaraco.test, jiter, jiwer, joserfc, jsoncpp, jsonpath, jsonpath-ng, jsonpath-python, kagglehub, keplergl, kt-legacy, langchain-community, langchain-experimental, langchain-snowflake, langchain-text-splitters, libabseil, libflac, libgfortran-ng, libgfortran5, libglib, libgomp, libgrpc, libgsf, libmagic, libogg, libopenblas, libpostal, libprotobuf, libsentencepiece, libsndfile, libstdcxx-ng, libtheora, libtiff, libvorbis, libwebp, lightweight-mmm, litestar, litestar-with-annotated-types, litestar-with-attrs, litestar-with-cryptography, litestar-with-jinja, litestar-with-jwt, litestar-with-prometheus, litestar-with-structlog, lunarcalendar-ext, matplotlib-venn, metricks, mimesis, modin-ray, momepy, mpg123, msgspec, msgspec-toml, msgspec-yaml, msitools, multipart, namex, nbconvert-all, nbconvert-core, nbconvert-pandoc, nlohmann_json, numba-cuda, numpyro, office365-rest-python-client, openapi-pydantic, opentelemetry-distro, opentelemetry-instrumentation, opentelemetry-instrumentation-system-metrics, optree, osmnx, pathlib, pdf2image, pfzy, pgpy, plumbum, pm4py, polars, polyfactory, poppler-cpp, postal, pre-commit, prompt-toolkit, propcache, py-partiql-parser, py_stringmatching, pyatlan, pyfakefs, pyfhel, pyhacrf-datamade, pyiceberg, pykrb5, pylbfgs, pymilvus, pymoo, pynisher, pyomo, pypdf, pypdf-with-crypto, pypdf-with-full, pypdf-with-image, pypng, pyprind, pyrfr, pysoundfile, pytest-codspeed, pytest-trio, python-barcode, python-box, python-docx, python-gssapi, python-iso639, python-magic, python-pandoc, python-zstd, pyuca, pyvinecopulib, pyxirr, qrcode, rai-sdk, ray-client, ray-observability, readline, rich-click, rouge-score, ruff, scikit-criteria, scikit-mobility, sentencepiece-python, sentencepiece-spm, setuptools-markdown, setuptools-scm, setuptools-scm-git-archive, shareplum, simdjson, simplecosine, sis-extras, slack-sdk, smac, snowflake-sqlalchemy, snowflake_legacy, socrata-py, spdlog, sphinxcontrib-images, sphinxcontrib-jquery, sphinxcontrib-youtube, splunk-opentelemetry, sqlfluff, squarify, st-theme, statistics, streamlit-antd-components, streamlit-condition-tree, streamlit-echarts, streamlit-feedback, streamlit-keplergl, streamlit-mermaid, streamlit-navigation-bar, streamlit-option-menu, strictyaml, stringdist, sybil, tensorflow-cpu, tensorflow-text, tiledb-py, torchaudio, torcheval, trio-websocket, trulens-connectors-snowflake, trulens-core, trulens-dashboard, trulens-feedback, trulens-otel-semconv, trulens-providers-cortex, tsdownsample, typing, typing-extensions, typing_extensions, unittest-xml-reporting, uritemplate, us, uuid6, wfdb, wsproto, zlib, zope.index
Added the following Python BuiltIn libraries with Direct status:
aifc, array, ast, asynchat, asyncio, asyncore, atexit, audioop, base64, bdb, binascii, bisect, builtins, bz2, calendar, cgi, cgitb, chunk, cmath, cmd, code, codecs, codeop, colorsys, compileall, concurrent, contextlib, contextvars, copy, copyreg, cprofile, crypt, csv, ctypes, curses, dbm, difflib, dis, distutils, doctest, email, ensurepip, enum, errno, faulthandler, fcntl, filecmp, fileinput, fnmatch, fractions, ftplib, functools, gc, getopt, getpass, gettext, graphlib, grp, gzip, hashlib, heapq, hmac, html, http, idlelib, imaplib, imghdr, imp, importlib, inspect, ipaddress, itertools, keyword, linecache, locale, lzma, mailbox, mailcap, marshal, math, mimetypes, mmap, modulefinder, msilib, multiprocessing, netrc, nis, nntplib, numbers, operator, optparse, os, ossaudiodev, pdb, pickle, pickletools, pipes, pkgutil, platform, plistlib, poplib, posix, pprint, profile, pstats, pty, pwd, py_compile, pyclbr, pydoc, queue, quopri, random, re, reprlib, resource, rlcompleter, runpy, sched, secrets, select, selectors, shelve, shlex, signal, site, sitecustomize, smtpd, smtplib, sndhdr, socket, socketserver, spwd, sqlite3, ssl, stat, string, stringprep, struct, subprocess, sunau, symtable, sysconfig, syslog, tabnanny, tarfile, telnetlib, tempfile, termios, test, textwrap, threading, timeit, tkinter, token, tokenize, tomllib, trace, traceback, tracemalloc, tty, turtle, turtledemo, types, unicodedata, urllib, uu, uuid, venv, warnings, wave, weakref, webbrowser, wsgiref, xdrlib, xml, xmlrpc, zipapp, zipfile, zipimport, zoneinfo
Added the following Python BuiltIn libraries with NotSupported status:
msvcrt, winreg, winsound
Changed¶
Updated the .NET version to v9.0.0.
Improved EWI SPRKPY1068.
Bumped the version of the Snowpark Python API supported by the SMA from 1.24.0 to 1.25.0.
Updated the detailed report template; it now includes the Snowpark for Pandas version.
Changed the following libraries from ThirdPartyLib to BuiltIn:
configparser, dataclasses, pathlib, readline, statistics, zlib
Updated the mapping status for the following Pandas elements, from Direct to Partial:
pandas.core.frame.DataFrame.add, pandas.core.frame.DataFrame.aggregate, pandas.core.frame.DataFrame.all, pandas.core.frame.DataFrame.apply, pandas.core.frame.DataFrame.astype, pandas.core.frame.DataFrame.cumsum, pandas.core.frame.DataFrame.div, pandas.core.frame.DataFrame.dropna, pandas.core.frame.DataFrame.eq, pandas.core.frame.DataFrame.ffill, pandas.core.frame.DataFrame.fillna, pandas.core.frame.DataFrame.floordiv, pandas.core.frame.DataFrame.ge, pandas.core.frame.DataFrame.groupby, pandas.core.frame.DataFrame.gt, pandas.core.frame.DataFrame.idxmax, pandas.core.frame.DataFrame.idxmin, pandas.core.frame.DataFrame.inf, pandas.core.frame.DataFrame.join, pandas.core.frame.DataFrame.le, pandas.core.frame.DataFrame.loc, pandas.core.frame.DataFrame.lt, pandas.core.frame.DataFrame.mask, pandas.core.frame.DataFrame.merge, pandas.core.frame.DataFrame.mod, pandas.core.frame.DataFrame.mul, pandas.core.frame.DataFrame.ne, pandas.core.frame.DataFrame.nunique, pandas.core.frame.DataFrame.pivot_table, pandas.core.frame.DataFrame.pow, pandas.core.frame.DataFrame.radd, pandas.core.frame.DataFrame.rank, pandas.core.frame.DataFrame.rdiv, pandas.core.frame.DataFrame.rename, pandas.core.frame.DataFrame.replace, pandas.core.frame.DataFrame.resample, pandas.core.frame.DataFrame.rfloordiv, pandas.core.frame.DataFrame.rmod, pandas.core.frame.DataFrame.rmul, pandas.core.frame.DataFrame.rolling, pandas.core.frame.DataFrame.round, pandas.core.frame.DataFrame.rpow, pandas.core.frame.DataFrame.rsub, pandas.core.frame.DataFrame.rtruediv, pandas.core.frame.DataFrame.shift, pandas.core.frame.DataFrame.skew, pandas.core.frame.DataFrame.sort_index, pandas.core.frame.DataFrame.sort_values, pandas.core.frame.DataFrame.sub, pandas.core.frame.DataFrame.to_dict, pandas.core.frame.DataFrame.transform, pandas.core.frame.DataFrame.transpose, pandas.core.frame.DataFrame.truediv, pandas.core.frame.DataFrame.var, pandas.core.indexes.datetimes.date_range, pandas.core.reshape.concat.concat, pandas.core.reshape.melt.melt, pandas.core.reshape.merge.merge, pandas.core.reshape.pivot.pivot_table, pandas.core.reshape.tile.cut, pandas.core.series.Series.add, pandas.core.series.Series.aggregate, pandas.core.series.Series.all, pandas.core.series.Series.any, pandas.core.series.Series.cumsum, pandas.core.series.Series.div, pandas.core.series.Series.dropna, pandas.core.series.Series.eq, pandas.core.series.Series.ffill, pandas.core.series.Series.fillna, pandas.core.series.Series.floordiv, pandas.core.series.Series.ge, pandas.core.series.Series.gt, pandas.core.series.Series.lt, pandas.core.series.Series.mask, pandas.core.series.Series.mod, pandas.core.series.Series.mul, pandas.core.series.Series.multiply, pandas.core.series.Series.ne, pandas.core.series.Series.pow, pandas.core.series.Series.quantile, pandas.core.series.Series.radd, pandas.core.series.Series.rank, pandas.core.series.Series.rdiv, pandas.core.series.Series.rename, pandas.core.series.Series.replace, pandas.core.series.Series.resample, pandas.core.series.Series.rfloordiv, pandas.core.series.Series.rmod, pandas.core.series.Series.rmul, pandas.core.series.Series.rolling, pandas.core.series.Series.rpow, pandas.core.series.Series.rsub, pandas.core.series.Series.rtruediv, pandas.core.series.Series.sample, pandas.core.series.Series.shift, pandas.core.series.Series.skew, pandas.core.series.Series.sort_index, pandas.core.series.Series.sort_values, pandas.core.series.Series.std, pandas.core.series.Series.sub, pandas.core.series.Series.subtract, pandas.core.series.Series.truediv, pandas.core.series.Series.value_counts, pandas.core.series.Series.var, pandas.core.series.Series.where, pandas.core.tools.numeric.to_numeric
Updated the mapping status for the following Pandas elements, from NotSupported to Direct:
pandas.core.frame.DataFrame.attrs, pandas.core.indexes.base.Index.to_numpy, pandas.core.series.Series.str.len, pandas.io.html.read_html, pandas.io.xml.read_xml, pandas.core.indexes.datetimes.DatetimeIndex.mean, pandas.core.resample.Resampler.indices, pandas.core.resample.Resampler.nunique, pandas.core.series.Series.items, pandas.core.tools.datetimes.to_datetime, pandas.io.sas.sasreader.read_sas, pandas.core.frame.DataFrame.attrs, pandas.core.frame.DataFrame.style, pandas.core.frame.DataFrame.items, pandas.core.groupby.generic.DataFrameGroupBy.head, pandas.core.groupby.generic.DataFrameGroupBy.median, pandas.core.groupby.generic.DataFrameGroupBy.min, pandas.core.groupby.generic.DataFrameGroupBy.nunique, pandas.core.groupby.generic.DataFrameGroupBy.tail, pandas.core.indexes.base.Index.is_boolean, pandas.core.indexes.base.Index.is_floating, pandas.core.indexes.base.Index.is_integer, pandas.core.indexes.base.Index.is_monotonic_decreasing, pandas.core.indexes.base.Index.is_monotonic_increasing, pandas.core.indexes.base.Index.is_numeric, pandas.core.indexes.base.Index.is_object, pandas.core.indexes.base.Index.max, pandas.core.indexes.base.Index.min, pandas.core.indexes.base.Index.name, pandas.core.indexes.base.Index.names, pandas.core.indexes.base.Index.rename, pandas.core.indexes.base.Index.set_names, pandas.core.indexes.datetimes.DatetimeIndex.day_name, pandas.core.indexes.datetimes.DatetimeIndex.month_name, pandas.core.indexes.datetimes.DatetimeIndex.time, pandas.core.indexes.timedeltas.TimedeltaIndex.ceil, pandas.core.indexes.timedeltas.TimedeltaIndex.days, pandas.core.indexes.timedeltas.TimedeltaIndex.floor, pandas.core.indexes.timedeltas.TimedeltaIndex.microseconds, pandas.core.indexes.timedeltas.TimedeltaIndex.nanoseconds, pandas.core.indexes.timedeltas.TimedeltaIndex.round, pandas.core.indexes.timedeltas.TimedeltaIndex.seconds, pandas.core.reshape.pivot.crosstab, pandas.core.series.Series.dt.round, pandas.core.series.Series.dt.time, pandas.core.series.Series.dt.weekday, pandas.core.series.Series.is_monotonic_decreasing, pandas.core.series.Series.is_monotonic_increasing
Updated the mapping status for the following Pandas elements, from NotSupported to Partial:
pandas.core.frame.DataFrame.align, pandas.core.series.Series.align, pandas.core.frame.DataFrame.tz_convert, pandas.core.frame.DataFrame.tz_localize, pandas.core.groupby.generic.DataFrameGroupBy.fillna, pandas.core.groupby.generic.SeriesGroupBy.fillna, pandas.core.indexes.datetimes.bdate_range, pandas.core.indexes.datetimes.DatetimeIndex.std, pandas.core.indexes.timedeltas.TimedeltaIndex.mean, pandas.core.resample.Resampler.asfreq, pandas.core.resample.Resampler.quantile, pandas.core.series.Series.map, pandas.core.series.Series.tz_convert, pandas.core.series.Series.tz_localize, pandas.core.window.expanding.Expanding.count, pandas.core.window.rolling.Rolling.count, pandas.core.groupby.generic.DataFrameGroupBy.aggregate, pandas.core.groupby.generic.SeriesGroupBy.aggregate, pandas.core.frame.DataFrame.applymap, pandas.core.series.Series.apply, pandas.core.groupby.generic.DataFrameGroupBy.bfill, pandas.core.groupby.generic.DataFrameGroupBy.ffill, pandas.core.groupby.generic.SeriesGroupBy.bfill, pandas.core.groupby.generic.SeriesGroupBy.ffill, pandas.core.frame.DataFrame.backfill, pandas.core.frame.DataFrame.bfill, pandas.core.frame.DataFrame.compare, pandas.core.frame.DataFrame.unstack, pandas.core.frame.DataFrame.asfreq, pandas.core.series.Series.backfill, pandas.core.series.Series.bfill, pandas.core.series.Series.compare, pandas.core.series.Series.unstack, pandas.core.series.Series.asfreq, pandas.core.series.Series.argmax, pandas.core.series.Series.argmin, pandas.core.indexes.accessors.CombinedDatetimelikeProperties.microsecond, pandas.core.indexes.accessors.CombinedDatetimelikeProperties.nanosecond, pandas.core.indexes.accessors.CombinedDatetimelikeProperties.day_name, pandas.core.indexes.accessors.CombinedDatetimelikeProperties.month_name, pandas.core.indexes.accessors.CombinedDatetimelikeProperties.month_start, pandas.core.indexes.accessors.CombinedDatetimelikeProperties.month_end, pandas.core.indexes.accessors.CombinedDatetimelikeProperties.is_year_start, pandas.core.indexes.accessors.CombinedDatetimelikeProperties.is_year_end, pandas.core.indexes.accessors.CombinedDatetimelikeProperties.is_quarter_start, pandas.core.indexes.accessors.CombinedDatetimelikeProperties.is_quarter_end, pandas.core.indexes.accessors.CombinedDatetimelikeProperties.is_leap_year, pandas.core.indexes.accessors.CombinedDatetimelikeProperties.floor, pandas.core.indexes.accessors.CombinedDatetimelikeProperties.ceil, pandas.core.groupby.generic.DataFrameGroupBy.idxmax, pandas.core.groupby.generic.DataFrameGroupBy.idxmin, pandas.core.groupby.generic.DataFrameGroupBy.std, pandas.core.indexes.timedeltas.TimedeltaIndex.mean, pandas.core.tools.timedeltas.to_timedelta
Known Issues¶
This version includes an issue that prevents converting the sample project; this will be fixed in the next release.
Version 2.4.3 (Jan 9, 2025)¶
Application & CLI Version 2.4.3¶
Desktop App¶
Added a troubleshooting guide link to the crash report modal.
Included SMA Core Versions¶
Snowpark Conversion Core 4.15.0
Added¶
Added the following PySpark elements to the ConversionStatusPySpark.csv file as NotSupported:
pyspark.sql.streaming.readwriter.DataStreamReader.table
pyspark.sql.streaming.readwriter.DataStreamReader.schema
pyspark.sql.streaming.readwriter.DataStreamReader.options
pyspark.sql.streaming.readwriter.DataStreamReader.option
pyspark.sql.streaming.readwriter.DataStreamReader.load
pyspark.sql.streaming.readwriter.DataStreamReader.format
pyspark.sql.streaming.query.StreamingQuery.awaitTermination
pyspark.sql.streaming.readwriter.DataStreamWriter.partitionBy
pyspark.sql.streaming.readwriter.DataStreamWriter.toTable
pyspark.sql.streaming.readwriter.DataStreamWriter.trigger
pyspark.sql.streaming.readwriter.DataStreamWriter.queryName
pyspark.sql.streaming.readwriter.DataStreamWriter.outputMode
pyspark.sql.streaming.readwriter.DataStreamWriter.format
pyspark.sql.streaming.readwriter.DataStreamWriter.option
pyspark.sql.streaming.readwriter.DataStreamWriter.foreachBatch
pyspark.sql.streaming.readwriter.DataStreamWriter.start
Changed¶
Updated the format of the Hive SQL EWIs:
SPRKHVSQL1001
SPRKHVSQL1002
SPRKHVSQL1003
SPRKHVSQL1004
SPRKHVSQL1005
SPRKHVSQL1006
Updated the format of the Spark SQL EWIs:
SPRKSPSQL1001
SPRKSPSQL1002
SPRKSPSQL1003
SPRKSPSQL1004
SPRKSPSQL1005
SPRKSPSQL1006
Fixed¶
Fixed a bug that prevented the tool from recognizing certain PySpark elements.
Fixed a mismatch between the number of calls identified as ThirdParty and the number of ThirdParty import calls.
Version 2.4.2 (Dec 13, 2024)¶
Application & CLI Version 2.4.2¶
Included SMA Core Versions¶
Snowpark Conversion Core 4.14.0
Added¶
Added the following Spark elements to ConversionStatusPySpark.csv:
pyspark.broadcast.Broadcast.value
pyspark.conf.SparkConf.getAll
pyspark.conf.SparkConf.setAll
pyspark.conf.SparkConf.setMaster
pyspark.context.SparkContext.addFile
pyspark.context.SparkContext.addPyFile
pyspark.context.SparkContext.binaryFiles
pyspark.context.SparkContext.setSystemProperty
pyspark.context.SparkContext.version
pyspark.files.SparkFiles
pyspark.files.SparkFiles.get
pyspark.rdd.RDD.count
pyspark.rdd.RDD.distinct
pyspark.rdd.RDD.reduceByKey
pyspark.rdd.RDD.saveAsTextFile
pyspark.rdd.RDD.take
pyspark.rdd.RDD.zipWithIndex
pyspark.sql.context.SQLContext.udf
pyspark.sql.types.StructType.simpleString
Changed¶
Updated the documentation of the Pandas EWIs PNDSPY1001, PNDSPY1002, and PNDSPY1003, and of SPRKSCL1137, to align with the standardized format, ensuring consistency and clarity across all EWIs.
Updated the documentation of the following Scala EWIs: SPRKSCL1106 and SPRKSCL1107, aligning them with the standardized format to ensure consistency and clarity across all EWIs.
Fixed¶
Fixed a bug that caused UserDefined symbols to appear in the third-party usage inventory.
Version 2.4.1 (Dec 4, 2024)¶
Application & CLI Version 2.4.1¶
Included SMA Core Versions¶
Snowpark Conversion Core 4.13.1
Command Line Interface¶
Changed¶
Added a timestamp to the output folder.
Snowpark Conversion Core 4.13.1¶
Added¶
Added a "Source Language" column to the library mapping table.
Added Others as a new category in the Pandas API summary table of DetailedReport.docx.
Changed¶
Updated the documentation of Python EWI SPRKPY1058.
Updated the message of Pandas EWI PNDSPY1002 to show the relevant Pandas element.
Updated the way .csv reports are created; they are now overwritten after a second run.
Fixed¶
Fixed a bug that prevented notebook files from being generated in the output.
Fixed the replacers for the get and set methods of pyspark.sql.conf.RuntimeConfig; the replacers now match the correct full names.
Fixed an incorrect query tag version.
Fixed an issue where UserDefined packages were reported as ThirdPartyLib.
Version 2.3.1 (Nov 14, 2024)¶
Application & CLI Version 2.3.1¶
Included SMA Core Versions¶
Snowpark Conversion Core 4.12.0
Desktop App¶
Fixed¶
Fixed a case-sensitivity issue in the --sql option.
Removed¶
Removed the platform name from the show-ac message.
Snowpark Conversion Core 4.12.0¶
Added¶
Added support for Snowpark Python 1.23.0 and 1.24.0.
Added a new EWI for the pyspark.sql.dataframe.DataFrame.writeTo function. All usages of this function now get EWI SPRKPY1087.
Changed¶
Updated the documentation of the Scala EWIs from SPRKSCL1137 to SPRKSCL1156 to align with the standardized format, ensuring consistency and clarity across all EWIs.
Updated the documentation of the Scala EWIs from SPRKSCL1117 to SPRKSCL1136 to align with the standardized format, ensuring consistency and clarity across all EWIs.
Updated the message displayed for the following EWIs:
SPRKPY1082
SPRKPY1083
Updated the documentation of the Scala EWIs from SPRKSCL1100 to SPRKSCL1105, from SPRKSCL1108 to SPRKSCL1116, and from SPRKSCL1157 to SPRKSCL1175 to align with the standardized format, ensuring consistency and clarity across all EWIs.
Updated the mapping status of the following PySpark elements from NotSupported to Direct with EWI:
pyspark.sql.readwriter.DataFrameWriter.option => snowflake.snowpark.DataFrameWriter.option: all usages of this function now get EWI SPRKPY1088.
pyspark.sql.readwriter.DataFrameWriter.options => snowflake.snowpark.DataFrameWriter.options: all usages of this function now get EWI SPRKPY1089.
Updated the mapping status of the following PySpark element from Workaround to Rename:
pyspark.sql.readwriter.DataFrameWriter.partitionBy => snowflake.snowpark.DataFrameWriter.partition_by
Updated the EWI documentation: SPRKSCL1000, SPRKSCL1001, SPRKSCL1002, SPRKSCL1100, SPRKSCL1101, SPRKSCL1102, SPRKSCL1103, SPRKSCL1104, SPRKSCL1105.
Removed¶
Removed pyspark.sql.dataframe.DataFrameStatFunctions.writeTo from the conversion status; this element no longer exists.
Deprecated¶
Deprecated the following EWI codes:
SPRKPY1081
SPRKPY1084
Version 2.3.0 (Oct 30, 2024)¶
Application & CLI Version 2.3.0¶
Snowpark Conversion Core 4.11.0
Snowpark Conversion Core 4.11.0¶
Added¶
Added a new column named Url to the Issues.csv file, which redirects to the corresponding EWI documentation.
Added new EWIs for the following Spark elements:
[SPRKPY1082] pyspark.sql.readwriter.DataFrameReader.load
[SPRKPY1083] pyspark.sql.readwriter.DataFrameWriter.save
[SPRKPY1084] pyspark.sql.readwriter.DataFrameWriter.option
[SPRKPY1085] pyspark.ml.feature.VectorAssembler
[SPRKPY1086] pyspark.ml.linalg.VectorUDT
Added 38 new Pandas elements:
pandas.core.frame.DataFrame.select
pandas.core.frame.DataFrame.str
pandas.core.frame.DataFrame.str.replace
pandas.core.frame.DataFrame.str.upper
pandas.core.frame.DataFrame.to_list
pandas.core.frame.DataFrame.tolist
pandas.core.frame.DataFrame.unique
pandas.core.frame.DataFrame.values.tolist
pandas.core.frame.DataFrame.withColumn
pandas.core.groupby.generic._SeriesGroupByScalar
pandas.core.groupby.generic._SeriesGroupByScalar[S1].agg
pandas.core.groupby.generic._SeriesGroupByScalar[S1].aggregate
pandas.core.indexes.datetimes.DatetimeIndex.year
pandas.core.series.Series.columns
pandas.core.tools.datetimes.to_datetime.date
pandas.core.tools.datetimes.to_datetime.dt.strftime
pandas.core.tools.datetimes.to_datetime.strftime
pandas.io.parsers.readers.TextFileReader.apply
pandas.io.parsers.readers.TextFileReader.astype
pandas.io.parsers.readers.TextFileReader.columns
pandas.io.parsers.readers.TextFileReader.copy
pandas.io.parsers.readers.TextFileReader.drop
pandas.io.parsers.readers.TextFileReader.drop_duplicates
pandas.io.parsers.readers.TextFileReader.fillna
pandas.io.parsers.readers.TextFileReader.groupby
pandas.io.parsers.readers.TextFileReader.head
pandas.io.parsers.readers.TextFileReader.iloc
pandas.io.parsers.readers.TextFileReader.isin
pandas.io.parsers.readers.TextFileReader.iterrows
pandas.io.parsers.readers.TextFileReader.loc
pandas.io.parsers.readers.TextFileReader.merge
pandas.io.parsers.readers.TextFileReader.rename
pandas.io.parsers.readers.TextFileReader.shape
pandas.io.parsers.readers.TextFileReader.to_csv
pandas.io.parsers.readers.TextFileReader.to_excel
pandas.io.parsers.readers.TextFileReader.unique
pandas.io.parsers.readers.TextFileReader.values
pandas.tseries.offsets
Version 2.2.3 (Oct 24, 2024)¶
Application Version 2.2.3¶
Included SMA Core Versions¶
Snowpark Conversion Core 4.10.0
Desktop App¶
Fixed¶
Fixed a bug that caused the SMA to show the SnowConvert label instead of Snowpark Migration Accelerator in the menu bar of the Windows version.
Fixed a bug that caused the SMA to crash when it lacked read/write permission for the .config directory on macOS or the AppData directory on Windows.
Command Line Interface¶
Changed¶
Renamed the CLI executable from snowct to sma.
Removed the source language parameter, so you no longer need to specify whether you are running a Python or a Scala assessment/conversion.
Expanded the command-line arguments supported by the CLI by adding the following new arguments:
--enableJupyter | -j: flag to indicate whether the conversion from Databricks notebooks to Jupyter is enabled.
--sql | -f: database engine syntax to use when detecting SQL commands.
--customerEmail | -e: configures the customer email address.
--customerCompany | -c: configures the customer company.
--projectName | -p: configures the customer project.
Updated some texts to reflect the correct name of the application, ensuring consistency and clarity across all messages.
Updated the terms of use of the application.
Updated and expanded the CLI documentation to reflect the latest features, enhancements, and changes.
Updated the text shown before proceeding with the execution of the SMA, as an improvement.
Updated the CLI to accept "Yes" as a valid argument when prompting for user confirmation.
Allowed the CLI to continue execution without waiting for user interaction by specifying the -y or --yes argument.
Updated the help information of the --sql argument to show the values this argument expects.
Snowpark Conversion Core Version 4.10.0¶
Added¶
Added a new EWI for the pyspark.sql.readwriter.DataFrameWriter.partitionBy function. All usages of this function now get EWI SPRKPY1081.
Added a new column named Technology to the ImportUsagesInventory.csv file.
Changed¶
Updated the third-party libraries readiness score to also take Unknown libraries into account.
Updated the AssessmentFiles.zip file to include .json files instead of .pam files.
Improved the CSV-to-JSON conversion mechanism to make inventory processing more performant.
Improved the documentation of the following EWIs:
SPRKPY1029
SPRKPY1054
SPRKPY1055
SPRKPY1063
SPRKPY1075
SPRKPY1076
Updated the mapping status of the following Spark Scala elements from Direct to Rename:
org.apache.spark.sql.functions.shiftLeft => com.snowflake.snowpark.functions.shiftleft
org.apache.spark.sql.functions.shiftRight => com.snowflake.snowpark.functions.shiftright
Updated the mapping status of the following Spark Scala elements from Not Supported to Direct:
org.apache.spark.sql.functions.shiftleft => com.snowflake.snowpark.functions.shiftleft
org.apache.spark.sql.functions.shiftright => com.snowflake.snowpark.functions.shiftright
Fixed¶
Fixed a bug that caused the SMA to populate the Origin column of the ImportUsagesInventory.csv file incorrectly.
Fixed a bug that caused the SMA not to classify the imported libraries io, json, logging, and unittest as Python built-in imports in the ImportUsagesInventory.csv file and the DetailedReport.docx file.
Version 2.2.2 (Oct 11, 2024)¶
Application Version 2.2.2¶
Feature updates include:
Snowpark Conversion Core 4.8.0
Snowpark Conversion Core Version 4.8.0¶
Added¶
Added the EwiCatalog.csv and .md files to reorganize the documentation.
Added the mapping status of pyspark.sql.functions.ln as Direct.
Added a transformation for pyspark.context.SparkContext.getOrCreate. See EWI SPRKPY1080 for more details.
Added an improvement to the SymbolTable to infer the type of the parameters in functions.
Added support in the SymbolTable for static methods, no longer assuming the first parameter is self.
Added documentation for the missing EWIs:
SPRKHVSQL1005
SPRKHVSQL1006
SPRKSPSQL1005
SPRKSPSQL1006
SPRKSCL1002
SPRKSCL1170
SPRKSCL1171
SPRKPY1057
SPRKPY1058
SPRKPY1059
SPRKPY1060
SPRKPY1061
SPRKPY1064
SPRKPY1065
SPRKPY1066
SPRKPY1067
SPRKPY1069
SPRKPY1070
SPRKPY1077
SPRKPY1078
SPRKPY1079
SPRKPY1101
Changed¶
Updated the mapping status of pyspark.sql.functions.array_remove from NotSupported to Direct.
Fixed¶
Fixed the Code File Sizing table in the Detail Report to exclude .sql and .hql files, and added an Extra Large row to the table.
Fixed the missing update_query_tag when SparkSession was defined in multiple lines in Python.
Fixed the missing update_query_tag when SparkSession was defined in multiple lines in Scala.
Fixed the missing EWI SPRKHVSQL1001 on some SQL statements with parsing errors.
Fixed an issue so that newline values inside string literals are preserved.
Fixed the total lines of code shown in the File Type Summary table.
Fixed the Parsing Score showing as 0 when files were recognized successfully.
Fixed the LOC count in the inventory for Databricks magic SQL cells.
Version 2.2.0 (Sep 26, 2024)¶
Application Version 2.2.0¶
Feature updates include:
Snowpark Conversion Core 4.6.0
Snowpark Conversion Core Version 4.6.0¶
Added¶
Added a transformation for pyspark.sql.readwriter.DataFrameReader.parquet.
Added a transformation for pyspark.sql.readwriter.DataFrameReader.option when it belongs to a Parquet method chain (see the sketch below).
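A minimal sketch of the source pattern these transformations target (illustrative only; the path and option values are made up, and spark is assumed to be an existing SparkSession):

.. code-block:: python

    # PySpark source: an option call chained in front of a Parquet read is
    # now transformed along with the read itself.
    df = (spark.read
          .option("mergeSchema", "true")
          .parquet("/data/events"))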
Changed¶
Updated the mapping status of:
pyspark.sql.types.StructType.fields from NotSupported to Direct.
pyspark.sql.types.StructType.names from NotSupported to Direct.
pyspark.context.SparkContext.setLogLevel from Workaround to Transformation. For more details, see EWIs SPRKPY1078 and SPRKPY1079.
org.apache.spark.sql.functions.round from WorkAround to Direct.
org.apache.spark.sql.functions.udf from NotDefined to Transformation. For more details, see EWIs SPRKSCL1174 and SPRKSCL1175.
Updated the mapping status of the following Spark elements from DirectHelper to Direct:
org.apache.spark.sql.functions.hex
org.apache.spark.sql.functions.unhex
org.apache.spark.sql.functions.shiftleft
org.apache.spark.sql.functions.shiftright
org.apache.spark.sql.functions.reverse
org.apache.spark.sql.functions.isnull
org.apache.spark.sql.functions.unix_timestamp
org.apache.spark.sql.functions.randn
org.apache.spark.sql.functions.signum
org.apache.spark.sql.functions.sign
org.apache.spark.sql.functions.collect_list
org.apache.spark.sql.functions.log10
org.apache.spark.sql.functions.log1p
org.apache.spark.sql.functions.base64
org.apache.spark.sql.functions.unbase64
org.apache.spark.sql.functions.regexp_extract
org.apache.spark.sql.functions.expr
org.apache.spark.sql.functions.date_format
org.apache.spark.sql.functions.desc
org.apache.spark.sql.functions.asc
org.apache.spark.sql.functions.size
org.apache.spark.sql.functions.locate
org.apache.spark.sql.functions.ntile
Fixed¶
Fixed the value shown in the Pandas API total percentage.
Fixed the total percentage of the ImportCalls table in the Detail Report.
Deprecated¶
Deprecated the following EWI code:
SPRKSCL1115
Version 2.1.7 (Sep 12, 2024)¶
Application Version 2.1.7¶
Feature updates include:
Snowpark Conversion Core 4.5.7
Snowpark Conversion Core 4.5.2
Snowpark Conversion Core Version 4.5.7¶
Hotfixed¶
Fixed an issue that added a total row in the Spark Usages Summaries when there was no usage data.
Upgraded the Python assembly to Version=1.3.111, which now parses trailing commas in multiline arguments.
Snowpark Conversion Core Version 4.5.2¶
Added¶
Added a transformation for pyspark.sql.readwriter.DataFrameReader.option:
When the chain comes from a CSV method call.
When the chain comes from a JSON method call.
Added a transformation for pyspark.sql.readwriter.DataFrameReader.json.
Changed¶
Ran the SMA on SQL strings passed to Python/Scala functions.
Created an AST in Scala/Python to emit a temporary SQL unit.
Created the SqlEmbeddedUsages.csv inventory.
Deprecated the SqlStatementsInventory.csv and SqlExtractionInventory.csv inventories.
Integrated an EWI for SQL literals that could not be processed.
Created a new task to process SQL-embedded code.
Collected information for the SqlEmbeddedUsages.csv inventory in Python.
Replaced the SQL-converted code with literals in Python.
Updated the test cases after the implementation.
Created tables and views for telemetry on the SqlEmbeddedUsages inventory.
Collected information for the SqlEmbeddedUsages.csv report in Scala.
Replaced the SQL-converted code with literals in Scala.
Checked the line-number ordering of the embedded SQL report.
Filled SqlFunctionsInfo.csv with the SQL functions documented for SparkSQL and HiveSQL.
Updated the mapping status of:
org.apache.spark.sql.SparkSession.sparkContext from NotSupported to Transformation.
org.apache.spark.sql.Builder.config from NotSupported to Transformation. With this new mapping status, the SMA removes all related usages of this function from the source code.
Version 2.1.6 (Sep 5, 2024)¶
Application Version 2.1.6¶
Hotfix changes for Snowpark Engines Core Version 4.5.1
Spark Conversion Core Version 4.5.1¶
Hotfixed¶
Added a mechanism to convert the temporary Databricks notebooks generated by the SMA into exported Databricks notebooks.
Version 2.1.5 (Aug 29, 2024)¶
Application Version 2.1.5¶
Feature updates include:
Updated Spark Conversion Core to 4.3.2.
Spark Conversion Core Version 4.3.2¶
Added¶
Added a mechanism (via decoration) to get the line and column of the elements identified in notebook cells.
Added an EWI for pyspark.sql.functions.from_json.
Added a transformation for pyspark.sql.readwriter.DataFrameReader.csv.
Enabled the query tag mechanism for Scala files.
Added the code analysis score and additional links to the detailed report.
Added a column named OriginFilePath to InputFilesInventory.csv.
Changed¶
Updated the mapping status of pyspark.sql.functions.from_json from Not Supported to Transformation.
Updated the mapping status of the following Spark elements from Workaround to Direct:
org.apache.spark.sql.functions.countDistinct
org.apache.spark.sql.functions.max
org.apache.spark.sql.functions.min
org.apache.spark.sql.functions.mean
Deprecated¶
Deprecated the following EWI codes:
SPRKSCL1135
SPRKSCL1136
SPRKSCL1153
SPRKSCL1155
Fixed¶
Fixed a bug that caused the Spark API score to be calculated incorrectly.
Fixed a bug that prevented empty SQL files, or SQL files containing only comments, from being copied to the output folder.
Fixed a bug in the DetailedReport where the notebook statistics for LOC and cell count were not accurate.
Version 2.1.2 (Aug 14, 2024)¶
Application Version 2.1.2¶
Feature updates include:
Updated Spark Conversion Core to 4.2.0.
Spark Conversion Core Version 4.2.0¶
Added¶
Added a Technology column to the SparkUsagesInventory.
Added an EWI for undefined SQL elements.
Added the SqlFunctions inventory.
Collected information for the SqlFunctions inventory.
Changed¶
The engine now processes and prints partially parsed Python files, instead of leaving the original files unmodified.
Python notebook cells with parsing errors are also processed and printed.
Fixed¶
Fixed pandas.core.indexes.datetimes.DatetimeIndex.strftime being reported incorrectly.
Fixed a mismatch between the SQL readiness score and SQL Usages by Support Status.
Fixed a bug that caused the SMA to report an incorrect mapping status for pandas.core.series.Series.empty.
Fixed a mismatch between Spark API Usages Ready for Conversion in DetailedReport.docx and the UsagesReadyForConversion row in Assesment.json.
Version 2.1.1 (Aug 8, 2024)¶
Application Version 2.1.1¶
Feature updates include:
Updated Spark Conversion Core to 4.1.0.
Spark Conversion Core Version 4.1.0¶
Added¶
Added the following information to the AssessmentReport.json file:
The third-party libraries readiness score.
The number of identified third-party library calls.
The number of third-party library calls supported in Snowpark.
The color code associated with the third-party readiness score, the Spark API readiness score, and the SQL readiness score.
Added a transformation for SqlSimpleDataType in Spark create-table statements.
Added a direct mapping for pyspark.sql.functions.get.
Added a direct mapping for pyspark.sql.functions.to_varchar.
As part of the changes after unification, the tool now generates an execution info file in the engine.
Added a replacer for pyspark.sql.SparkSession.builder.appName.
Changed¶
Updated the mapping status of the following Spark elements:
From Not Supported to Direct mapping:
pyspark.sql.functions.sign
pyspark.sql.functions.signum
Changed the notebook cells inventory report to indicate the kind of content of each cell in the Element column.
Added a SCALA_READINESS_SCORE column that reports the readiness score related only to references to the Spark API in Scala files.
Added partial support for converting table properties in ALTER TABLE and ALTER VIEW.
Updated the conversion status of the SqlSimpleDataType node in Spark create-table statements from Pending to Transformation.
Updated the Snowpark Scala API version supported by the SMA from 1.7.0 to 1.12.1, and updated the mapping status of:
org.apache.spark.sql.SparkSession.getOrCreate from Rename to Direct.
org.apache.spark.sql.functions.sum from Workaround to Direct.
Updated the Snowpark Python API version supported by the SMA from 1.15.0 to 1.20.0, and updated the mapping status of:
pyspark.sql.functions.arrays_zip from Not Supported to Direct.
Updated the mapping status of the following Pandas elements:
Direct mapping:
pandas.core.frame.DataFrame.any
pandas.core.frame.DataFrame.applymap
Updated the mapping status of the following Pandas elements:
From Not Supported to Direct mapping:
pandas.core.frame.DataFrame.groupby
pandas.core.frame.DataFrame.index
pandas.core.frame.DataFrame.T
pandas.core.frame.DataFrame.to_dict
From Not Supported to Rename mapping:
pandas.core.frame.DataFrame.map
Updated the mapping status of the following Pandas elements:
Direct mapping:
pandas.core.frame.DataFrame.where
pandas.core.groupby.generic.SeriesGroupBy.agg
pandas.core.groupby.generic.SeriesGroupBy.aggregate
pandas.core.groupby.generic.DataFrameGroupBy.agg
pandas.core.groupby.generic.DataFrameGroupBy.aggregate
pandas.core.groupby.generic.DataFrameGroupBy.apply
Not Supported mapping:
pandas.core.frame.DataFrame.to_parquet
pandas.core.generic.NDFrame.to_csv
pandas.core.generic.NDFrame.to_excel
pandas.core.generic.NDFrame.to_sql
Updated the mapping status of the following Pandas elements:
Direct mapping:
pandas.core.series.Series.empty
pandas.core.series.Series.apply
pandas.core.reshape.tile.qcut
Direct mapping with EWI:
pandas.core.series.Series.fillna
pandas.core.series.Series.astype
pandas.core.reshape.melt.melt
pandas.core.reshape.tile.cut
pandas.core.reshape.pivot.pivot_table
Updated the mapping status of the following Pandas elements:
Direct mapping:
pandas.core.series.Series.dt
pandas.core.series.Series.groupby
pandas.core.series.Series.loc
pandas.core.series.Series.shape
pandas.core.tools.datetimes.to_datetime
pandas.io.excel._base.ExcelFile
Not Supported mapping:
pandas.core.series.Series.dt.strftime
Updated the mapping status of the following Pandas elements:
From Not Supported to Direct mapping:
pandas.io.parquet.read_parquet
pandas.io.parsers.readers.read_csv
Updated the mapping status of the following Pandas elements:
From Not Supported to Direct mapping:
pandas.io.pickle.read_pickle
pandas.io.sql.read_sql
pandas.io.sql.read_sql_query
Updated the description of Understanding the SQL Readiness Score.
Updated the PyProgramCollector to collect packages and populate the current packages inventory with data from Python source code.
Updated the mapping status of pyspark.sql.SparkSession.builder.appName from Rename to Transformation.
Removed the following Scala integration tests:
AssesmentReportTest_AssessmentMode.ValidateReports_AssessmentMode
AssessmentReportTest_PythonAndScala_Files.ValidateReports_PythonAndScala
AssessmentReportTestWithoutSparkUsages.ValidateReports_WithoutSparkUsages
Updated the mapping status of pandas.core.generic.NDFrame.shape from Not Supported to Direct.
Updated the mapping status of pandas.core.series from Not Supported to Direct.
Deprecated¶
Deprecated the EWI code SPRKSCL1160, since org.apache.spark.sql.functions.sum is now a Direct mapping.
Fixed¶
Fixed a bug where Custom Magics without arguments were not supported in Jupyter notebook cells.
Fixed an issue where EWIs were incorrectly generated in the issues.csv report when parsing errors occurred.
Fixed a bug that caused the SMA not to process Databricks-exported notebooks as Databricks notebooks.
Fixed a stack overflow error when handling name collisions of declared types created inside package objects.
Fixed the handling of complex lambda type names involving generics, for example, def func[X,Y](f:(Map[Option[X], Y] => Map[Y, X]))...
Fixed a bug that could cause the SMA to add a PySpark EWI code instead of a Pandas EWI code to Pandas elements that were not yet recognized.
Fixed a typo in the detailed report template: renamed a column from "Percentage of all Python Files" to "Percentage of all files".
Fixed a bug that reported pandas.core.series.Series.shape incorrectly.