Snowpark Migration Accelerator: Release Notes¶
Note that the release notes below are organized by release date. Version numbers for both the application and the conversion core will appear below.
Version 2.10.1 (Oct 23, 2025)¶
Application & CLI Version 2.10.1¶
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.72
Added¶
Added support for Snowpark Scala v1.17.0:
From Not Supported to Direct:
Dataset:
org.apache.spark.sql.Dataset.isEmpty→com.snowflake.snowpark.DataFrame.isEmpty
Row:
org.apache.spark.sql.Row.mkString→com.snowflake.snowpark.Row.mkString
StructType:
org.apache.spark.sql.types.StructType.fieldNames→com.snowflake.snowpark.types.StructType.fieldNames
From Not Supported to Rename:
Functions:
org.apache.spark.functions.flatten→com.snowflake.snowpark.functions.array_flatten
From Direct to Rename:
Functions:
org.apache.spark.functions.to_date→com.snowflake.snowpark.functions.try_to_date
org.apache.spark.functions.to_timestamp→com.snowflake.snowpark.functions.try_to_timestamp
From Direct Helper to Rename:
Functions:
org.apache.spark.sql.functions.concat_ws→com.snowflake.snowpark.functions.concat_ws_ignore_nulls
From Not Defined to Direct:
Functions:
org.apache.spark.functions.try_to_timestamp→com.snowflake.snowpark.functions.try_to_timestamp
Embedded SQL is now migrated when a SQL statement literal is assigned to a local variable.
Example: sqlStat = "SELECT colName FROM myTable" session.sql(sqlStat)
Embedded SQL is now supported for literal string concatenations.
Example: session.sql("SELECT colName " + "FROM myTable")
Changed¶
Updated the supported versions of Snowpark Python API and Snowpark Pandas API from 1.36.0 to 1.39.0.
Updated the mapping status for the following PySpark xpath functions from NotSupported to Direct with EWI SPRKPY1103:
pyspark.sql.functions.xpath
pyspark.sql.functions.xpath_boolean
pyspark.sql.functions.xpath_double
pyspark.sql.functions.xpath_float
pyspark.sql.functions.xpath_int
pyspark.sql.functions.xpath_long
pyspark.sql.functions.xpath_number
pyspark.sql.functions.xpath_short
pyspark.sql.functions.xpath_string
Updated the mapping status for the following PySpark elements from NotDefined to Direct:
pyspark.sql.functions.bit_and→snowflake.snowpark.functions.bitand_agg
pyspark.sql.functions.bit_or→snowflake.snowpark.functions.bitor_agg
pyspark.sql.functions.bit_xor→snowflake.snowpark.functions.bitxor_agg
pyspark.sql.functions.getbit→snowflake.snowpark.functions.getbit
Updated the mapping status for the following Pandas elements from NotSupported to Direct:
pandas.core.indexes.base.Index→modin.pandas.Index
pandas.core.indexes.base.Index.get_level_values→modin.pandas.Index.get_level_values
Updated the mapping status for the following PySpark functions from NotSupported to Rename:
pyspark.sql.functions.now→snowflake.snowpark.functions.current_timestamp
Fixed¶
Fixed an issue where Scala imports were not migrated when the imported function is renamed.
Example:
Source code:

.. code-block:: scala

   package com.example.functions

   import org.apache.spark.sql.functions.{to_timestamp, lit}

   object ToTimeStampTest extends App {
     to_timestamp(lit("sample"))
     to_timestamp(lit("sample"), "yyyy-MM-dd")
   }

Output code:

.. code-block:: scala

   package com.example.functions

   import com.snowflake.snowpark.functions.{try_to_timestamp, lit}
   import com.snowflake.snowpark_extensions.Extensions._
   import com.snowflake.snowpark_extensions.Extensions.functions._

   object ToTimeStampTest extends App {
     try_to_timestamp(lit("sample"))
     try_to_timestamp(lit("sample"), "yyyy-MM-dd")
   }
Version 2.10.0 (Sep 24, 2025)¶
Application & CLI Version 2.10.0¶
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.62
Added¶
Added functionality to migrate SQL embedded with Python format interpolation.
Added support for DataFrame.select and DataFrame.sort transformations for greater data processing flexibility.
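As a rough illustration of what "SQL embedded with Python format interpolation" means, the sketch below detects f-string literals that look like SQL statements in Python source. This is not the SMA's actual detector; the function name, the keyword list, and the `{...}` placeholder convention are all assumptions made for this example.

```python
import ast

SQL_KEYWORDS = ("SELECT", "INSERT", "UPDATE", "DELETE", "MERGE", "CREATE")

def extract_embedded_sql(source: str) -> list:
    """Collect f-string literals that look like SQL statements.

    Walk the AST, find JoinedStr (f-string) nodes, rebuild their
    constant text with a placeholder for each interpolated expression,
    and keep the ones that start with a SQL keyword.
    """
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.JoinedStr):
            parts = []
            for value in node.values:
                if isinstance(value, ast.Constant):
                    parts.append(str(value.value))
                else:  # an interpolated {expr}: keep a placeholder
                    parts.append("{...}")
            text = "".join(parts)
            if text.lstrip().upper().startswith(SQL_KEYWORDS):
                found.append(text)
    return found

code = 'df = session.sql(f"SELECT {col} FROM {table} WHERE id = {i}")'
print(extract_embedded_sql(code))  # ['SELECT {...} FROM {...} WHERE id = {...}']
```

A real converter would then hand the recovered statement to the SQL translation engine rather than print it.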
Changed¶
Bumped the supported versions of Snowpark Python API and Snowpark Pandas API to 1.36.0.
Updated the mapping status of pandas.core.frame.DataFrame.boxplot from Not Supported to Direct.
Updated the mapping status of DataFrame.select, Dataset.select, DataFrame.sort, and Dataset.sort from Direct to Transformation. Snowpark Scala allows a sequence of columns to be passed directly to the select and sort functions, so this transformation changes usages such as df.select(cols: _*) to df.select(cols) and df.sort(cols: _*) to df.sort(cols).
Bumped the Python AST and Parser version to 149.1.9.
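A minimal sketch of the select/sort rewrite described above, expressed as a textual substitution. The SMA operates on the parsed Scala code rather than on raw text, so this regex-based function and its name are illustrative assumptions only.

```python
import re

# Matches select(expr: _*) or sort(expr: _*) and captures the expression,
# so the Scala vararg expansion (`: _*`) can be dropped.
VARARG_CALL = re.compile(r"\b(select|sort)\(\s*([^()]*?)\s*:\s*_\*\s*\)")

def drop_vararg_expansion(scala_source: str) -> str:
    """Rewrite df.select(cols: _*) to df.select(cols); same for sort."""
    return VARARG_CALL.sub(r"\1(\2)", scala_source)

print(drop_vararg_expansion("df.select(cols: _*).sort(keys: _*)"))
# df.select(cols).sort(keys)
```

Calls that already pass a plain sequence, such as `df.select(cols)`, are left untouched.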
Updated the status to Direct for the following pandas functions:
pandas.core.frame.DataFrame.to_excel
pandas.core.series.Series.to_excel
pandas.io.feather_format.read_feather
pandas.io.orc.read_orc
pandas.io.stata.read_stata
Updated the status of pyspark.sql.pandas.map_ops.PandasMapOpsMixin.mapInPandas to Workaround, using the EWI SPRKPY1102.
Fixed¶
Fixed issue that affected SqlEmbedded transformations when using chained method calls.
Fixed transformations involving PySqlExpr by using the new PyLiteralSql to avoid losing tails.
Resolved internal stability issues to improve tool robustness and reliability.
Version 2.7.7 (Aug 28, 2025)¶
Application & CLI Version 2.7.7¶
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.46
Added¶
Added new Pandas EWI documentation PNDSPY1011.
Added support for the following Pandas functions:
pandas.core.algorithms.unique
pandas.core.dtypes.missing.isna
pandas.core.dtypes.missing.isnull
pandas.core.dtypes.missing.notna
pandas.core.dtypes.missing.notnull
pandas.core.resample.Resampler.count
pandas.core.resample.Resampler.max
pandas.core.resample.Resampler.mean
pandas.core.resample.Resampler.median
pandas.core.resample.Resampler.min
pandas.core.resample.Resampler.size
pandas.core.resample.Resampler.sum
pandas.core.arrays.timedeltas.TimedeltaArray.total_seconds
pandas.core.series.Series.get
pandas.core.series.Series.to_frame
pandas.core.frame.DataFrame.assign
pandas.core.frame.DataFrame.get
pandas.core.frame.DataFrame.to_numpy
pandas.core.indexes.base.Index.is_unique
pandas.core.indexes.base.Index.has_duplicates
pandas.core.indexes.base.Index.shape
pandas.core.indexes.base.Index.array
pandas.core.indexes.base.Index.str
pandas.core.indexes.base.Index.equals
pandas.core.indexes.base.Index.identical
pandas.core.indexes.base.Index.unique
Added support for the following Spark Scala functions:
org.apache.spark.sql.functions.format_number
org.apache.spark.sql.functions.from_unixtime
org.apache.spark.sql.functions.instr
org.apache.spark.sql.functions.months_between
org.apache.spark.sql.functions.pow
org.apache.spark.sql.functions.to_unix_timestamp
org.apache.spark.sql.Row.getAs
Changed¶
Bumped the version of Snowpark Pandas API supported by the SMA to 1.33.0.
Bumped the version of Snowpark Scala API supported by the SMA to 1.16.0.
Updated the mapping status of pyspark.sql.group.GroupedData.pivot from Transformation to Direct.
Updated the mapping status of org.apache.spark.sql.Builder.master from NotSupported to Transformation. This transformation removes all the identified usages of this element during code conversion.
Updated the mapping status of org.apache.spark.sql.types.StructType.fieldIndex from NotSupported to Direct.
Updated the mapping status of org.apache.spark.sql.Row.fieldIndex from NotSupported to Direct.
Updated the mapping status of org.apache.spark.sql.SparkSession.stop from NotSupported to Rename. All the identified usages of this element are renamed to com.snowflake.snowpark.Session.close during code conversion.
Updated the mapping status of org.apache.spark.sql.DataFrame.unpersist and org.apache.spark.sql.Dataset.unpersist from NotSupported to Transformation. This transformation removes all the identified usages of this element during code conversion.
Fixed¶
Fixed continuation backslash on removed tailed functions.
Fixed the LIBRARY_PREFIX column in the ConversionStatusLibraries.csv file to use the right identifier for the scikit-learn library family (scikit-*).
Fixed a bug where multiline grouped operations were not parsed.
Version 2.9.0 (Sep 09, 2025)¶
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.53
Added¶
The following mappings are now performed for org.apache.spark.sql.Dataset[T]:
org.apache.spark.sql.Dataset.union is now com.snowflake.snowpark.DataFrame.unionAll
org.apache.spark.sql.Dataset.unionByName is now com.snowflake.snowpark.DataFrame.unionAllByName
Added support for org.apache.spark.sql.functions.broadcast as a transformation.
Changed¶
Increased the supported Snowpark Python API version for the SMA from 1.27.0 to 1.33.0.
Updated the status of the pyspark.sql.functions.randn function to Direct.
Fixed¶
Resolved an issue where org.apache.spark.SparkContext.parallelize was not resolving; it is now supported as a transformation.
Fixed the Dataset.persist transformation to work with any type of Dataset, not just Dataset[Row].
Version 2.7.6 (Jul 17, 2025)¶
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.30
Added¶
Adjusted mappings for spark.DataReader methods:
DataFrame.union is now DataFrame.unionAll.
DataFrame.unionByName is now DataFrame.unionAllByName.
Added multi-level artifact dependency columns to the artifact inventory.
Added new Pandas EWI documentation, from PNDSPY1005 to PNDSPY1010.
Added a specific EWI for pandas.core.series.Series.apply.
Changed¶
Bumped the version of the Snowpark Pandas API supported by the SMA from 1.27.0 to 1.30.0.
Fixed¶
Fixed an issue with missing values in the formula to get the SQL readiness score.
Fixed a bug that was causing some Pandas elements to have the default EWI message from PySpark.
Version 2.7.5 (Jul 2, 2025)¶
Application & CLI Version 2.7.5¶
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.19
Changed¶
Refactored Pandas imports: Pandas imports now use modin.pandas instead of snowflake.snowpark.modin.pandas.
Improved dbutils and magic commands transformation:
A new sfutils.py file is now generated, and all dbutils prefixes are replaced with sfutils.
For Databricks (DBX) notebooks, an implicit import for sfutils is automatically added.
The sfutils module simulates various dbutils methods, including file system operations (dbutils.fs) via a defined Snowflake FileSystem (SFFS) stage, and handles notebook execution (dbutils.notebook.run) by transforming it to EXECUTE NOTEBOOK SQL functions.
dbutils.notebook.exit is removed, as it is not required in Snowflake.
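To make the shape of the generated shim concrete, here is a deliberately simplified, pure-Python stand-in. Every name here (the shim classes, the `@SFFS_STAGE` location, the rendered SQL text) is a hypothetical illustration; the actual sfutils.py generated by the SMA differs and executes real Snowflake statements.

```python
class _NotebookShim:
    """Illustrative stand-in: maps a dbutils.notebook.run-style call
    onto an EXECUTE NOTEBOOK statement (rendered here as SQL text)."""
    def run(self, path: str, timeout: int = 0) -> str:
        name = path.strip("/").replace("/", ".")
        return f"EXECUTE NOTEBOOK {name}()"

class _FsShim:
    """Illustrative stand-in: maps dbutils.fs-style paths onto a
    stage location so file operations target Snowflake storage."""
    def __init__(self, stage: str = "@SFFS_STAGE"):
        self.stage = stage
    def ls(self, path: str) -> str:
        return f"LIST {self.stage}/{path.lstrip('/')}"

class sfutils:
    notebook = _NotebookShim()
    fs = _FsShim()

print(sfutils.notebook.run("/jobs/daily_load"))  # EXECUTE NOTEBOOK jobs.daily_load()
print(sfutils.fs.ls("/raw/events"))              # LIST @SFFS_STAGE/raw/events
```

Because the converted code only has its dbutils prefixes rewritten to sfutils, a shim with matching attribute structure is enough to keep call sites unchanged.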
Fixed¶
Updates in SnowConvert Reports: SnowConvert reports now include the CellId column when instances originate from SMA, and the FileName column displays the full path.
Updated Artifacts Dependency for SnowConvert Reports: The SMA’s artifact inventory report, which was previously impacted by the integration of SnowConvert, has been restored. This update enables the SMA tool to accurately capture and analyze Object References and Missing Object References directly from SnowConvert reports, thereby ensuring the correct retrieval of SQL dependencies for the inventory.
Version 2.7.4 (Jun 26, 2025)¶
Application & CLI Version 2.7.4¶
Desktop App
Added¶
Added telemetry improvements.
Fixed¶
Fix documentation links in conversion settings pop-up and Pandas EWIs.
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.16
Added¶
Added transformation of Spark XML to Snowpark.
Added a Databricks SQL option in the SQL source language.
Added transformation of JDBC read connections.
Changed¶
All the SnowConvert reports are copied to the backup Zip file.
The folder is renamed from SqlReports to SnowConvertReports.
SqlFunctionsInventory is moved to the Reports folder.
All the SnowConvert reports are sent to Telemetry.
Fixed¶
Fixed a non-deterministic issue with the SQL Readiness Score.
Fixed a false-positive critical result that made the desktop crash.
Fixed issue causing the Artifacts dependency report not to show the SQL objects.
Version 2.7.2 (Jun 10, 2025)¶
Application & CLI Version 2.7.2¶
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.2
Fixed¶
Addressed an issue with SMA execution on the latest Windows OS, as previously reported. This fix resolves the issues encountered in version 2.7.1.
Version 2.7.1 (Jun 9, 2025)¶
Application & CLI Version 2.7.1¶
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.1
Added¶
The Snowpark Migration Accelerator (SMA) now orchestrates SnowConvert (https://docs.snowconvert.com/sc/general/about) to process SQL found in user workloads, including embedded SQL in Python / Scala code, Notebook SQL cells, .sql files, and .hql files.
SnowConvert now enhances the previous SMA capabilities:
Spark SQL (https://docs.snowconvert.com/sc/translation-references/spark-dbx)
A new folder in the Reports called SQL Reports contains the reports generated by SnowConvert.
Known Issues¶
SQL reports from the previous SMA version will appear empty for the following:
Reports/SqlElementsInventory.csv: partially covered by Reports/SqlReports/Elements.yyyymmdd.hhmmss.csv.
Reports/SqlFunctionsInventory.csv: refer to the new location with the same name at Reports/SqlReports/SqlFunctionsInventory.csv.
The artifact dependency inventory:
In the ArtifactDependencyInventory, the column for the SQL Object will appear empty.
Version 2.6.10 (May 5, 2025)¶
Application & CLI Version 2.6.10¶
Included SMA Core Versions¶
Snowpark Conversion Core 7.4.0
Fixed¶
Fixed wrong values in the 'checkpoints.json' file.
The 'sample' value lacked decimals (for integer values) and quotation marks.
The 'entryPoint' value had dots instead of slashes and was missing the file extension.
Updated the default value of the 'Convert DBX notebooks to Snowflake notebooks' setting to TRUE.
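For illustration only, the entryPoint fix above amounts to a normalization like the following. The function name and the default extension are assumptions for this sketch; the exact rules are internal to the SMA.

```python
def normalize_entry_point(dotted: str, extension: str = ".py") -> str:
    """Turn a dotted path such as 'src.notebooks.main' into
    'src/notebooks/main.py': slashes instead of dots, plus a
    file extension when one is missing."""
    path = dotted.replace(".", "/")
    if not path.endswith(extension):
        path += extension
    return path

print(normalize_entry_point("src.notebooks.main"))  # src/notebooks/main.py
```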
Version 2.6.8 (Apr 28, 2025)¶
Application & CLI Version 2.6.8¶
Desktop App¶
Added recognition of the checkpoints execution settings mechanism.
Added a mechanism to collect DBX magic commands into DbxElementsInventory.csv.
Added 'checkpoints.json' generation into the input directory.
Added a new EWI for all unsupported magic commands.
Added the collection of dbutils usages from Scala source notebooks into DbxElementsInventory.csv.
Included SMA Core Versions¶
Snowpark Conversion Core 7.2.53
Changed¶
Updated the handling of transformations from DBX Scala elements to Jupyter Python elements, commenting out the entire code of the cell.
Updated the handling of transformations of dbutils.notebook.run and "r" commands; for the latter, the entire code of the cell is also commented out.
Updated the name and the letter of the key used to convert the notebook files.
Fixed¶
Fixed the bug that was causing the transformation of DBX notebooks into .ipynb files to have the wrong format.
Fixed the bug that was causing .py DBX notebooks to not be transformable into .ipynb files.
Fixed a bug that was causing comments to be missing in the output code of DBX notebooks.
Fixed a bug that was causing raw Scala files to be converted into ipynb files.
Version 2.6.7 (Apr 21, 2025)¶
Application & CLI Version 2.6.7¶
Included SMA Core Versions¶
Snowpark Conversion Core 7.2.42
Changed¶
Updated DataFramesInventory to fill the EntryPoints column.
Version 2.6.6 (Apr 7, 2025)¶
Application & CLI Version 2.6.6¶
Desktop App¶
Added¶
Updated the DBX EWI link on the UI results page.
Included SMA Core Versions¶
Snowpark Conversion Core 7.2.39
Added¶
Added Execution Flow inventory generation.
Added implicit session setup in every DBx notebook transformation
Changed¶
Renamed the DbUtilsUsagesInventory.csv to DbxElementsInventory.csv
Fixed¶
Fixed a bug that caused a Parsing error when a backslash came after a type hint.
Fixed relative imports that do not start with a dot and relative imports with a star.
Version 2.6.5 (Mar 27, 2025)¶
Application & CLI Version 2.6.5¶
Desktop App¶
Added¶
Added a new conversion setting toggle to enable or disable the SMA checkpoints feature.
Fixed a report issue so the app does not crash when the POST API returns a 500 error.
Included SMA Core Versions¶
Snowpark Conversion Core 7.2.26
Added¶
Added generation of the checkpoints.json file into the output folder based on the DataFramesInventory.csv.
Added a "disableCheckpoints" flag to the CLI commands and additional parameters of the code processor.
Added a new replacer for Python to transform the dbutils.notebook.run node.
Added new replacers to transform the magic %run command.
Added new replacers (Python and Scala) to remove the dbutils.notebook.exit node.
Added Location column to artifacts inventory.
Changed¶
Refactored the normalized directory separator used in some parts of the solution.
Centralized the DBC extraction working folder name handling.
Updated Snowpark and Pandas version to v1.27.0
Updated the artifacts inventory columns to:
Name -> Dependency
File -> FileId
Status -> Status_detail
Added new column to the artifacts inventory:
Success
Fixed¶
Fixed an issue where the DataFrames inventory was not being uploaded to the stage correctly.
Version 2.6.4 (Mar 12, 2025)¶
Application & CLI Version 2.6.4¶
Included SMA Core Versions ¶
Snowpark Conversion Core 7.2.0
Added ¶
An Artifact Dependency Inventory
A replacer and an EWI for the pyspark.sql.types.StructType.fieldNames method, mapped to the snowflake.snowpark.types.StructType.fieldNames attribute.
The following PySpark functions with the status:
Direct Status
pyspark.sql.functions.bitmap_bit_position
pyspark.sql.functions.bitmap_bucket_number
pyspark.sql.functions.bitmap_construct_agg
pyspark.sql.functions.equal_null
pyspark.sql.functions.ifnull
pyspark.sql.functions.localtimestamp
pyspark.sql.functions.max_by
pyspark.sql.functions.min_by
pyspark.sql.functions.nvl
pyspark.sql.functions.regr_avgx
pyspark.sql.functions.regr_avgy
pyspark.sql.functions.regr_count
pyspark.sql.functions.regr_intercept
pyspark.sql.functions.regr_slope
pyspark.sql.functions.regr_sxx
pyspark.sql.functions.regr_sxy
pyspark.sql.functions.regr
NotSupported
pyspark.sql.functions.map_contains_key
pyspark.sql.functions.position
pyspark.sql.functions.regr_r2
pyspark.sql.functions.try_to_binary
The following Pandas functions with status
pandas.core.series.Series.str.ljust
pandas.core.series.Series.str.center
pandas.core.series.Series.str.pad
pandas.core.series.Series.str.rjust
Updated the following PySpark functions with the status:
From WorkAround to Direct
pyspark.sql.functions.acosh
pyspark.sql.functions.asinh
pyspark.sql.functions.atanh
pyspark.sql.functions.instr
pyspark.sql.functions.log10
pyspark.sql.functions.log1p
pyspark.sql.functions.log2
From NotSupported to Direct
pyspark.sql.functions.bit_length
pyspark.sql.functions.cbrt
pyspark.sql.functions.nth_value
pyspark.sql.functions.octet_length
pyspark.sql.functions.base64
pyspark.sql.functions.unbase64
Updated the following Pandas functions with the status:
From NotSupported to Direct
pandas.core.frame.DataFrame.pop
pandas.core.series.Series.between
pandas.core.series.Series.pop
Version 2.6.3 (Mar 6, 2025)¶
Application & CLI Version 2.6.3¶
Included SMA Core Versions ¶
Snowpark Conversion Core 7.1.13
Added ¶
Added a CSV generator class for new inventory creation.
Added a "full_name" column to the import usages inventory.
Added transformation from pyspark.sql.functions.concat_ws to snowflake.snowpark.functions._concat_ws_ignore_nulls.
Added logic for generation of checkpoints.json.
Added the inventories:
DataFramesInventory.csv
CheckpointsInventory.csv
Version 2.6.0 (Feb 21, 2025)¶
Application & CLI Version 2.6.0¶
Desktop App ¶
Updated the licensing agreement, acceptance is required.
Included SMA Core Versions¶
Snowpark Conversion Core 7.1.2
Added¶
Updated the mapping status for the following PySpark elements, from NotSupported to Direct
pyspark.sql.types.ArrayType.json, pyspark.sql.types.ArrayType.jsonValue, pyspark.sql.types.ArrayType.simpleString, pyspark.sql.types.ArrayType.typeName
pyspark.sql.types.AtomicType.json, pyspark.sql.types.AtomicType.jsonValue, pyspark.sql.types.AtomicType.simpleString, pyspark.sql.types.AtomicType.typeName
pyspark.sql.types.BinaryType.json, pyspark.sql.types.BinaryType.jsonValue, pyspark.sql.types.BinaryType.simpleString, pyspark.sql.types.BinaryType.typeName
pyspark.sql.types.BooleanType.json, pyspark.sql.types.BooleanType.jsonValue, pyspark.sql.types.BooleanType.simpleString, pyspark.sql.types.BooleanType.typeName
pyspark.sql.types.ByteType.json, pyspark.sql.types.ByteType.jsonValue, pyspark.sql.types.ByteType.simpleString, pyspark.sql.types.ByteType.typeName
pyspark.sql.types.DecimalType.json, pyspark.sql.types.DecimalType.jsonValue, pyspark.sql.types.DecimalType.simpleString, pyspark.sql.types.DecimalType.typeName
pyspark.sql.types.DoubleType.json, pyspark.sql.types.DoubleType.jsonValue, pyspark.sql.types.DoubleType.simpleString, pyspark.sql.types.DoubleType.typeName
pyspark.sql.types.FloatType.json, pyspark.sql.types.FloatType.jsonValue, pyspark.sql.types.FloatType.simpleString, pyspark.sql.types.FloatType.typeName
pyspark.sql.types.FractionalType.json, pyspark.sql.types.FractionalType.jsonValue, pyspark.sql.types.FractionalType.simpleString, pyspark.sql.types.FractionalType.typeName
pyspark.sql.types.IntegerType.json, pyspark.sql.types.IntegerType.jsonValue, pyspark.sql.types.IntegerType.simpleString, pyspark.sql.types.IntegerType.typeName
pyspark.sql.types.IntegralType.json, pyspark.sql.types.IntegralType.jsonValue, pyspark.sql.types.IntegralType.simpleString, pyspark.sql.types.IntegralType.typeName
pyspark.sql.types.LongType.json, pyspark.sql.types.LongType.jsonValue, pyspark.sql.types.LongType.simpleString, pyspark.sql.types.LongType.typeName
pyspark.sql.types.MapType.json, pyspark.sql.types.MapType.jsonValue, pyspark.sql.types.MapType.simpleString, pyspark.sql.types.MapType.typeName
pyspark.sql.types.NullType.json, pyspark.sql.types.NullType.jsonValue, pyspark.sql.types.NullType.simpleString, pyspark.sql.types.NullType.typeName
pyspark.sql.types.NumericType.json, pyspark.sql.types.NumericType.jsonValue, pyspark.sql.types.NumericType.simpleString, pyspark.sql.types.NumericType.typeName
pyspark.sql.types.ShortType.json, pyspark.sql.types.ShortType.jsonValue, pyspark.sql.types.ShortType.simpleString, pyspark.sql.types.ShortType.typeName
pyspark.sql.types.StringType.json, pyspark.sql.types.StringType.jsonValue, pyspark.sql.types.StringType.simpleString, pyspark.sql.types.StringType.typeName
pyspark.sql.types.StructType.json, pyspark.sql.types.StructType.jsonValue, pyspark.sql.types.StructType.simpleString, pyspark.sql.types.StructType.typeName
pyspark.sql.types.TimestampType.json, pyspark.sql.types.TimestampType.jsonValue, pyspark.sql.types.TimestampType.simpleString, pyspark.sql.types.TimestampType.typeName
pyspark.sql.types.StructField.simpleString, pyspark.sql.types.StructField.typeName, pyspark.sql.types.StructField.json, pyspark.sql.types.StructField.jsonValue
pyspark.sql.types.DataType.json, pyspark.sql.types.DataType.jsonValue, pyspark.sql.types.DataType.simpleString, pyspark.sql.types.DataType.typeName
pyspark.sql.session.SparkSession.getActiveSession, pyspark.sql.session.SparkSession.version
pandas.io.html.read_html, pandas.io.json._normalize.json_normalize
pyspark.sql.types.ArrayType.fromJson, pyspark.sql.types.MapType.fromJson, pyspark.sql.types.StructField.fromJson, pyspark.sql.types.StructType.fromJson
pandas.core.groupby.generic.DataFrameGroupBy.pct_change, pandas.core.groupby.generic.SeriesGroupBy.pct_change
Updated the mapping status for the following Pandas elements, from NotSupported to Direct
pandas.io.html.read_html
pandas.io.json._normalize.json_normalize
pandas.core.groupby.generic.DataFrameGroupBy.pct_change
pandas.core.groupby.generic.SeriesGroupBy.pct_change
Updated the mapping status for the following PySpark elements, from Rename to Direct
pyspark.sql.functions.collect_list
pyspark.sql.functions.size
Fixed ¶
Standardized the format of the version number in the inventories.
Version 2.5.2 (Feb 5, 2025)¶
Hotfix: Application & CLI Version 2.5.2¶
Desktop App¶
Fixed an issue when converting with the sample project option.
Included SMA Core Versions¶
Snowpark Conversion Core 5.3.0
Version 2.5.1 (Feb 4, 2025)¶
Application & CLI Version 2.5.1¶
Desktop App¶
Added a new modal when the user does not have write permission.
Updated the licensing agreement; acceptance is required.
CLI¶
Fixed the year shown on the CLI screen when displaying "--version" or "-v".
Included SMA Core Versions¶
Snowpark Conversion Core 5.3.0
Added¶
Added the following Python Third-Party libraries with Direct status:
about-timeaffinegapaiohappyeyeballsalibi-detectalive-progressallure-nose2allure-robotframeworkanaconda-cloud-clianaconda-mirrorastropy-iers-dataasynchasyncsshautotsautovimlaws-msk-iam-sasl-signer-pythonazure-functionsbackports.tarfileblasbottlebsoncairocapnprotocaptumcategorical-distancecensusclickhouse-driverclustergramcmaconda-anaconda-telemetryconfigspacecpp-expecteddask-exprdata-science-utilsdatabricks-sdkdatetime-distancedb-dtypesdedupededupe-variable-datetimededupe_lehvenshtein_searchdedupe_levenshtein_searchdiff-coverdiptestdmglibdocstring_parserdoublemetaphonedspy-aieconmlemceeemojienvironseth-abieth-hasheth-typingeth-utilsexpatfiletypefitterflask-corsfpdf2frozendictgcabgeojsongettextglib-toolsgoogle-adsgoogle-ai-generativelanguagegoogle-api-python-clientgoogle-auth-httplib2google-cloud-bigquerygoogle-cloud-bigquery-coregoogle-cloud-bigquery-storagegoogle-cloud-bigquery-storage-coregoogle-cloud-resource-managergoogle-generativeaigooglemapsgraphemegraphenegraphql-relaygravisgreykitegrpc-google-iam-v1harfbuzzhatch-fancy-pypi-readmehaversinehiclasshicolor-icon-themehigheredhmmlearnholidays-exthttplib2icuimbalanced-ensembleimmutabledictimportlib-metadataimportlib-resourcesinquirerpyiterative-telemetryjaraco.contextjaraco.testjiterjiwerjoserfcjsoncppjsonpathjsonpath-ngjsonpath-pythonkagglehubkeplerglkt-legacylangchain-communitylangchain-experimentallangchain-snowflakelangchain-text-splitterslibabseillibflaclibgfortran-nglibgfortran5libgliblibgomplibgrpclibgsflibmagiclibogglibopenblaslibpostallibprotobuflibsentencepiecelibsndfilelibstdcxx-nglibtheoralibtifflibvorbislibwebplightweight-mmmlitestarlitestar-with-annotated-typeslitestar-with-attrslitestar-with-cryptographylitestar-with-jinjalitestar-with-jwtlitestar-with-prometheuslitestar-with-structloglunarcalendar-extmatplotlib-vennmetricksmimesismodin-raymomepympg123msgspecmsgspec-tomlmsgspec-yamlmsitoolsmultipartnamexnbconvert-allnbconvert-corenbconvert-pandocnlohmann_jsonnumba-cudanumpyrooffice365-rest-python-cli
entopenapi-pydanticopentelemetry-distroopentelemetry-instrumentationopentelemetry-instrumentation-system-metricsoptreeosmnxpathlibpdf2imagepfzypgpyplumbumpm4pypolarspolyfactorypoppler-cpppostalpre-commitprompt-toolkitpropcachepy-partiql-parserpy_stringmatchingpyatlanpyfakefspyfhelpyhacrf-datamadepyicebergpykrb5pylbfgspymilvuspymoopynisherpyomopypdfpypdf-with-cryptopypdf-with-fullpypdf-with-imagepypngpyprindpyrfrpysoundfilepytest-codspeedpytest-triopython-barcodepython-boxpython-docxpython-gssapipython-iso639python-magicpython-pandocpython-zstdpyucapyvinecopulibpyxirrqrcoderai-sdkray-clientray-observabilityreadlinerich-clickrouge-scoreruffscikit-criteriascikit-mobilitysentencepiece-pythonsentencepiece-spmsetuptools-markdownsetuptools-scmsetuptools-scm-git-archiveshareplumsimdjsonsimplecosinesis-extrasslack-sdksmacsnowflake-sqlalchemysnowflake_legacysocrata-pyspdlogsphinxcontrib-imagessphinxcontrib-jquerysphinxcontrib-youtubesplunk-opentelemetrysqlfluffsquarifyst-themestatisticsstreamlit-antd-componentsstreamlit-condition-treestreamlit-echartsstreamlit-feedbackstreamlit-keplerglstreamlit-mermaidstreamlit-navigation-barstreamlit-option-menustrictyamlstringdistsybiltensorflow-cputensorflow-texttiledb-ptorchaudiotorchevaltrio-websockettrulens-connectors-snowflaketrulens-coretrulens-dashboardtrulens-feedbacktrulens-otel-semconvtrulens-providers-cortextsdownsampletypingtyping-extensionstyping_extensionsunittest-xml-reportinguritemplateusuuid6wfdbwsprotozlibzope.index
Added the following Python BuiltIn libraries with Direct status:
aifcarrayastasynchatasyncioasyncoreatexitaudioopbase64bdbbinasciibitsectbuiltinsbz2calendarcgicgitbchunkcmathcmdcodecodecscodeopcolorsyscompileallconcurrentcontextlibcontextvarscopycopyregcprofilecryptcsvctypescursesdbmdifflibdisdistutilsdoctestemailensurepipenumerrnofaulthandlerfcntlfilecmpfileinputfnmatchfractionsftplibfunctoolsgcgetoptgetpassgettextgraphlibgrpgziphashlibheapqhmachtmlhttpidlelibimaplibimghdrimpimportlibinspectipaddressitertoolskeywordlinecachelocalelzmamailboxmailcapmarshalmathmimetypesmmapmodulefindermsilibmultiprocessingnetrcnisnntplibnumbersoperatoroptparseossaudiodevpdbpicklepickletoolspipespkgutilplatformplistlibpoplibposixpprintprofilepstatsptypwdpy_compilepyclbrpydocqueuequoprirandomrereprlibresourcerlcompleterrunpyschedsecretsselectselectorsshelveshlexsignalsitesitecustomizesmtpdsmtplibsndhdrsocketsocketserverspwdsqlite3sslstatstringstringprepstructsubprocesssunausymtablesysconfigsyslogtabnannytarfiletelnetlibtempfiletermiostesttextwrapthreadingtimeittkintertokentokenizetomllibtracetracebacktracemallocttyturtleturtledemotypesunicodedataurllibuuuuidvenvwarningswaveweakrefwebbrowserwsgirefxdrlibxmlxmlrpczipappzipfilezipimportzoneinfo
Added the following Python BuiltIn libraries with NotSupported status:
msvcrt
winreg
winsound
Changed¶
Updated the .NET version to v9.0.0.
Improved EWI SPRKPY1068.
Bumped the version of Snowpark Python API supported by the SMA from 1.24.0 to 1.25.0.
Updated the detailed report template; it now includes the Snowpark version for Pandas.
Changed the following libraries from ThirdPartyLib to BuiltIn:
configparser
dataclasses
pathlib
readline
statistics
zlib
Updated the mapping status for the following Pandas elements, from Direct to Partial:
pandas.core.frame.DataFrame.add, pandas.core.frame.DataFrame.aggregate, pandas.core.frame.DataFrame.all, pandas.core.frame.DataFrame.apply, pandas.core.frame.DataFrame.astype, pandas.core.frame.DataFrame.cumsum, pandas.core.frame.DataFrame.div, pandas.core.frame.DataFrame.dropna, pandas.core.frame.DataFrame.eq, pandas.core.frame.DataFrame.ffill, pandas.core.frame.DataFrame.fillna, pandas.core.frame.DataFrame.floordiv, pandas.core.frame.DataFrame.ge, pandas.core.frame.DataFrame.groupby, pandas.core.frame.DataFrame.gt, pandas.core.frame.DataFrame.idxmax, pandas.core.frame.DataFrame.idxmin, pandas.core.frame.DataFrame.inf, pandas.core.frame.DataFrame.join, pandas.core.frame.DataFrame.le, pandas.core.frame.DataFrame.loc, pandas.core.frame.DataFrame.lt, pandas.core.frame.DataFrame.mask, pandas.core.frame.DataFrame.merge, pandas.core.frame.DataFrame.mod, pandas.core.frame.DataFrame.mul, pandas.core.frame.DataFrame.ne, pandas.core.frame.DataFrame.nunique, pandas.core.frame.DataFrame.pivot_table, pandas.core.frame.DataFrame.pow, pandas.core.frame.DataFrame.radd, pandas.core.frame.DataFrame.rank, pandas.core.frame.DataFrame.rdiv, pandas.core.frame.DataFrame.rename, pandas.core.frame.DataFrame.replace, pandas.core.frame.DataFrame.resample, pandas.core.frame.DataFrame.rfloordiv, pandas.core.frame.DataFrame.rmod, pandas.core.frame.DataFrame.rmul, pandas.core.frame.DataFrame.rolling, pandas.core.frame.DataFrame.round, pandas.core.frame.DataFrame.rpow, pandas.core.frame.DataFrame.rsub, pandas.core.frame.DataFrame.rtruediv, pandas.core.frame.DataFrame.shift, pandas.core.frame.DataFrame.skew, pandas.core.frame.DataFrame.sort_index, pandas.core.frame.DataFrame.sort_values, pandas.core.frame.DataFrame.sub, pandas.core.frame.DataFrame.to_dict, pandas.core.frame.DataFrame.transform, pandas.core.frame.DataFrame.transpose, pandas.core.frame.DataFrame.truediv, pandas.core.frame.DataFrame.var, pandas.core.indexes.datetimes.date_range, pandas.core.reshape.concat.concat, pandas.core.reshape.melt.melt, pandas.core.reshape.merge.merge, pandas.core.reshape.pivot.pivot_table, pandas.core.reshape.tile.cut, pandas.core.series.Series.add, pandas.core.series.Series.aggregate, pandas.core.series.Series.all, pandas.core.series.Series.any, pandas.core.series.Series.cumsum, pandas.core.series.Series.div, pandas.core.series.Series.dropna, pandas.core.series.Series.eq, pandas.core.series.Series.ffill, pandas.core.series.Series.fillna, pandas.core.series.Series.floordiv, pandas.core.series.Series.ge, pandas.core.series.Series.gt, pandas.core.series.Series.lt, pandas.core.series.Series.mask, pandas.core.series.Series.mod, pandas.core.series.Series.mul, pandas.core.series.Series.multiply, pandas.core.series.Series.ne, pandas.core.series.Series.pow, pandas.core.series.Series.quantile, pandas.core.series.Series.radd, pandas.core.series.Series.rank, pandas.core.series.Series.rdiv, pandas.core.series.Series.rename, pandas.core.series.Series.replace, pandas.core.series.Series.resample, pandas.core.series.Series.rfloordiv, pandas.core.series.Series.rmod, pandas.core.series.Series.rmul, pandas.core.series.Series.rolling, pandas.core.series.Series.rpow, pandas.core.series.Series.rsub, pandas.core.series.Series.rtruediv, pandas.core.series.Series.sample, pandas.core.series.Series.shift, pandas.core.series.Series.skew, pandas.core.series.Series.sort_index, pandas.core.series.Series.sort_values, pandas.core.series.Series.std, pandas.core.series.Series.sub, pandas.core.series.Series.subtract, pandas.core.series.Series.truediv, pandas.core.series.Series.value_counts, pandas.core.series.Series.var, pandas.core.series.Series.where, pandas.core.tools.numeric.to_numeric
Updated the mapping status for the following Pandas elements, from NotSupported to Direct:
pandas.core.frame.DataFrame.attrs
pandas.core.indexes.base.Index.to_numpy
pandas.core.series.Series.str.len
pandas.io.html.read_html
pandas.io.xml.read_xml
pandas.core.indexes.datetimes.DatetimeIndex.mean
pandas.core.resample.Resampler.indices
pandas.core.resample.Resampler.nunique
pandas.core.series.Series.items
pandas.core.tools.datetimes.to_datetime
pandas.io.sas.sasreader.read_sas
pandas.core.frame.DataFrame.attrs
pandas.core.frame.DataFrame.style
pandas.core.frame.DataFrame.items
pandas.core.groupby.generic.DataFrameGroupBy.head
pandas.core.groupby.generic.DataFrameGroupBy.median
pandas.core.groupby.generic.DataFrameGroupBy.min
pandas.core.groupby.generic.DataFrameGroupBy.nunique
pandas.core.groupby.generic.DataFrameGroupBy.tail
pandas.core.indexes.base.Index.is_boolean
pandas.core.indexes.base.Index.is_floating
pandas.core.indexes.base.Index.is_integer
pandas.core.indexes.base.Index.is_monotonic_decreasing
pandas.core.indexes.base.Index.is_monotonic_increasing
pandas.core.indexes.base.Index.is_numeric
pandas.core.indexes.base.Index.is_object
pandas.core.indexes.base.Index.max
pandas.core.indexes.base.Index.min
pandas.core.indexes.base.Index.name
pandas.core.indexes.base.Index.names
pandas.core.indexes.base.Index.rename
pandas.core.indexes.base.Index.set_names
pandas.core.indexes.datetimes.DatetimeIndex.day_name
pandas.core.indexes.datetimes.DatetimeIndex.month_name
pandas.core.indexes.datetimes.DatetimeIndex.time
pandas.core.indexes.timedeltas.TimedeltaIndex.ceil
pandas.core.indexes.timedeltas.TimedeltaIndex.days
pandas.core.indexes.timedeltas.TimedeltaIndex.floor
pandas.core.indexes.timedeltas.TimedeltaIndex.microseconds
pandas.core.indexes.timedeltas.TimedeltaIndex.nanoseconds
pandas.core.indexes.timedeltas.TimedeltaIndex.round
pandas.core.indexes.timedeltas.TimedeltaIndex.seconds
pandas.core.reshape.pivot.crosstab
pandas.core.series.Series.dt.round
pandas.core.series.Series.dt.time
pandas.core.series.Series.dt.weekday
pandas.core.series.Series.is_monotonic_decreasing
pandas.core.series.Series.is_monotonic_increasing
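As a plain-pandas illustration of one of these newly Direct-mapped elements, `pandas.core.series.Series.str.len` computes element-wise string lengths; under a Direct mapping the same call shape is expected to work unchanged against the Snowpark pandas API (this sketch runs against regular pandas only):

```python
import pandas as pd

# Series.str.len is one of the elements that moved from NotSupported
# to Direct; missing values propagate as NaN rather than raising.
s = pd.Series(["spark", "snowpark", None])
lengths = s.str.len()  # element-wise length; None stays NaN
print(lengths)
```

Because a Direct mapping requires no code change, the SMA leaves such calls as-is in the converted code.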
Updated the mapping status for the following Pandas elements, from NotSupported to Partial:
pandas.core.frame.DataFrame.align
pandas.core.series.Series.align
pandas.core.frame.DataFrame.tz_convert
pandas.core.frame.DataFrame.tz_localize
pandas.core.groupby.generic.DataFrameGroupBy.fillna
pandas.core.groupby.generic.SeriesGroupBy.fillna
pandas.core.indexes.datetimes.bdate_range
pandas.core.indexes.datetimes.DatetimeIndex.std
pandas.core.indexes.timedeltas.TimedeltaIndex.mean
pandas.core.resample.Resampler.asfreq
pandas.core.resample.Resampler.quantile
pandas.core.series.Series.map
pandas.core.series.Series.tz_convert
pandas.core.series.Series.tz_localize
pandas.core.window.expanding.Expanding.count
pandas.core.window.rolling.Rolling.count
pandas.core.groupby.generic.DataFrameGroupBy.aggregate
pandas.core.groupby.generic.SeriesGroupBy.aggregate
pandas.core.frame.DataFrame.applymap
pandas.core.series.Series.apply
pandas.core.groupby.generic.DataFrameGroupBy.bfill
pandas.core.groupby.generic.DataFrameGroupBy.ffill
pandas.core.groupby.generic.SeriesGroupBy.bfill
pandas.core.groupby.generic.SeriesGroupBy.ffill
pandas.core.frame.DataFrame.backfill
pandas.core.frame.DataFrame.bfill
pandas.core.frame.DataFrame.compare
pandas.core.frame.DataFrame.unstack
pandas.core.frame.DataFrame.asfreq
pandas.core.series.Series.backfill
pandas.core.series.Series.bfill
pandas.core.series.Series.compare
pandas.core.series.Series.unstack
pandas.core.series.Series.asfreq
pandas.core.series.Series.argmax
pandas.core.series.Series.argmin
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.microsecond
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.nanosecond
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.day_name
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.month_name
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.month_start
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.month_end
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.is_year_start
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.is_year_end
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.is_quarter_start
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.is_quarter_end
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.is_leap_year
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.floor
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.ceil
pandas.core.groupby.generic.DataFrameGroupBy.idxmax
pandas.core.groupby.generic.DataFrameGroupBy.idxmin
pandas.core.groupby.generic.DataFrameGroupBy.std
pandas.core.indexes.timedeltas.TimedeltaIndex.mean
pandas.core.tools.timedeltas.to_timedelta
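To make the Partial status concrete, here is a plain-pandas sketch of `Series.align`, one of the elements listed above; "Partial" means the common call shapes are mapped, while some argument combinations may still behave differently under Snowpark pandas (this example runs against regular pandas only):

```python
import pandas as pd

# Series.align (now Partial) reindexes two series onto the union of
# their indexes, filling missing positions with NaN.
left = pd.Series([1, 2], index=["a", "b"])
right = pd.Series([10, 20], index=["b", "c"])
l_aligned, r_aligned = left.align(right)  # default outer join on the index
print(list(l_aligned.index))
```

For Partial-supported elements, the SMA typically converts the call and attaches an EWI so the argument usage can be reviewed.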
Known Issue¶
This version includes an issue: converting the sample project does not work in this version. It will be fixed in the next release.
Version 2.4.3 (Jan 9, 2025)¶
Application & CLI Version 2.4.3¶
Desktop App¶
Added link to the troubleshooting guide in the crash report modal.
Included SMA Core Versions¶
Snowpark Conversion Core 4.15.0
Added¶
Added the following PySpark elements to the ConversionStatusPySpark.csv file as NotSupported:
pyspark.sql.streaming.readwriter.DataStreamReader.table
pyspark.sql.streaming.readwriter.DataStreamReader.schema
pyspark.sql.streaming.readwriter.DataStreamReader.options
pyspark.sql.streaming.readwriter.DataStreamReader.option
pyspark.sql.streaming.readwriter.DataStreamReader.load
pyspark.sql.streaming.readwriter.DataStreamReader.format
pyspark.sql.streaming.query.StreamingQuery.awaitTermination
pyspark.sql.streaming.readwriter.DataStreamWriter.partitionBy
pyspark.sql.streaming.readwriter.DataStreamWriter.toTable
pyspark.sql.streaming.readwriter.DataStreamWriter.trigger
pyspark.sql.streaming.readwriter.DataStreamWriter.queryName
pyspark.sql.streaming.readwriter.DataStreamWriter.outputMode
pyspark.sql.streaming.readwriter.DataStreamWriter.format
pyspark.sql.streaming.readwriter.DataStreamWriter.option
pyspark.sql.streaming.readwriter.DataStreamWriter.foreachBatch
pyspark.sql.streaming.readwriter.DataStreamWriter.start
Changed¶
Updated Hive SQL EWIs format.
SPRKHVSQL1001
SPRKHVSQL1002
SPRKHVSQL1003
SPRKHVSQL1004
SPRKHVSQL1005
SPRKHVSQL1006
Updated Spark SQL EWIs format.
SPRKSPSQL1001
SPRKSPSQL1002
SPRKSPSQL1003
SPRKSPSQL1004
SPRKSPSQL1005
SPRKSPSQL1006
Fixed¶
Fixed a bug that caused some PySpark elements to not be identified by the tool.
Fixed the mismatch between the number of identified ThirdParty calls and the number of ThirdParty import calls.
Version 2.4.2 (Dec 13, 2024)¶
Application & CLI Version 2.4.2¶
Included SMA Core Versions¶
Snowpark Conversion Core 4.14.0
Added¶
Added the following Spark elements to ConversionStatusPySpark.csv:
pyspark.broadcast.Broadcast.value
pyspark.conf.SparkConf.getAll
pyspark.conf.SparkConf.setAll
pyspark.conf.SparkConf.setMaster
pyspark.context.SparkContext.addFile
pyspark.context.SparkContext.addPyFile
pyspark.context.SparkContext.binaryFiles
pyspark.context.SparkContext.setSystemProperty
pyspark.context.SparkContext.version
pyspark.files.SparkFiles
pyspark.files.SparkFiles.get
pyspark.rdd.RDD.count
pyspark.rdd.RDD.distinct
pyspark.rdd.RDD.reduceByKey
pyspark.rdd.RDD.saveAsTextFile
pyspark.rdd.RDD.take
pyspark.rdd.RDD.zipWithIndex
pyspark.sql.context.SQLContext.udf
pyspark.sql.types.StructType.simpleString
Changed¶
Updated the documentation of the Pandas EWIs PNDSPY1001, PNDSPY1002, and PNDSPY1003, and the Scala EWI SPRKSCL1137, to align with a standardized format, ensuring consistency and clarity across all the EWIs.
Updated the documentation of the following Scala EWIs: SPRKSCL1106 and SPRKSCL1107, to align with a standardized format, ensuring consistency and clarity across all the EWIs.
Fixed¶
Fixed a bug that caused UserDefined symbols to show up in the third-party usages inventory.
Version 2.4.1 (Dec 4, 2024)¶
Application & CLI Version 2.4.1¶
Included SMA Core Versions¶
Snowpark Conversion Core 4.13.1
Command Line Interface¶
Changed
Added timestamp to the output folder.
Snowpark Conversion Core 4.13.1¶
Added¶
Added ‘Source Language’ column to Library Mappings Table
Added Others as a new category in the Pandas API Summary table of the DetailedReport.docx.
Changed¶
Updated the documentation for the Python EWI SPRKPY1058.
Updated the message for the Pandas EWI PNDSPY1002 to show the related Pandas element.
Updated the way the .csv reports are created; they are now overwritten after a second run.
Fixed¶
Fixed a bug that was causing Notebook files not being generated in the output.
Fixed the replacer for the get and set methods from pyspark.sql.conf.RuntimeConfig; the replacer now matches the correct full names.
Fixed an incorrect query tag version.
Fixed UserDefined packages reported as ThirdPartyLib.
Version 2.3.1 (Nov 14, 2024)¶
Application & CLI Version 2.3.1¶
Included SMA Core Versions¶
Snowpark Conversion Core 4.12.0
Desktop App¶
Fixed
Fixed case-sensitivity issues in the --sql option.
Removed
Removed the platform name from the show-ac message.
Snowpark Conversion Core 4.12.0¶
Added¶
Added support for Snowpark Python 1.23.0 and 1.24.0.
Added a new EWI for the pyspark.sql.dataframe.DataFrame.writeTo function. All the usages of this function will now have the EWI SPRKPY1087.
Changed¶
Updated the documentation of the Scala EWIs from SPRKSCL1137 to SPRKSCL1156 to align with a standardized format, ensuring consistency and clarity across all the EWIs.
Updated the documentation of the Scala EWIs from SPRKSCL1117 to SPRKSCL1136 to align with a standardized format, ensuring consistency and clarity across all the EWIs.
Updated the message that is shown for the following EWIs:
SPRKPY1082
SPRKPY1083
Updated the documentation of the Scala EWIs from SPRKSCL1100 to SPRKSCL1105, from SPRKSCL1108 to SPRKSCL1116, and from SPRKSCL1157 to SPRKSCL1175 to align with a standardized format, ensuring consistency and clarity across all the EWIs.
Updated the mapping status of the following PySpark elements from NotSupported to Direct with EWI:
pyspark.sql.readwriter.DataFrameWriter.option => snowflake.snowpark.DataFrameWriter.option: all the usages of this function now have the EWI SPRKPY1088
pyspark.sql.readwriter.DataFrameWriter.options => snowflake.snowpark.DataFrameWriter.options: all the usages of this function now have the EWI SPRKPY1089
Updated the mapping status of the following PySpark elements from Workaround to Rename:
pyspark.sql.readwriter.DataFrameWriter.partitionBy => snowflake.snowpark.DataFrameWriter.partition_by
Updated EWI documentation: SPRKSCL1000, SPRKSCL1001, SPRKSCL1002, SPRKSCL1100, SPRKSCL1101, SPRKSCL1102, SPRKSCL1103, SPRKSCL1104, SPRKSCL1105.
Removed¶
Removed the pyspark.sql.dataframe.DataFrameStatFunctions.writeTo element from the conversion status, since this element does not exist.
Deprecated¶
Deprecated the following EWI codes:
SPRKPY1081
SPRKPY1084
Version 2.3.0 (Oct 30, 2024)¶
Application & CLI Version 2.3.0¶
Snowpark Conversion Core 4.11.0
Snowpark Conversion Core 4.11.0¶
Added¶
Added a new column called Url to the Issues.csv file, which redirects to the corresponding EWI documentation.
Added new EWIs for the following Spark elements:
[SPRKPY1082] pyspark.sql.readwriter.DataFrameReader.load
[SPRKPY1083] pyspark.sql.readwriter.DataFrameWriter.save
[SPRKPY1084] pyspark.sql.readwriter.DataFrameWriter.option
[SPRKPY1085] pyspark.ml.feature.VectorAssembler
[SPRKPY1086] pyspark.ml.linalg.VectorUDT
Added 38 new Pandas elements:
pandas.core.frame.DataFrame.select
pandas.core.frame.DataFrame.str
pandas.core.frame.DataFrame.str.replace
pandas.core.frame.DataFrame.str.upper
pandas.core.frame.DataFrame.to_list
pandas.core.frame.DataFrame.tolist
pandas.core.frame.DataFrame.unique
pandas.core.frame.DataFrame.values.tolist
pandas.core.frame.DataFrame.withColumn
pandas.core.groupby.generic._SeriesGroupByScalar
pandas.core.groupby.generic._SeriesGroupByScalar[S1].agg
pandas.core.groupby.generic._SeriesGroupByScalar[S1].aggregate
pandas.core.indexes.datetimes.DatetimeIndex.year
pandas.core.series.Series.columns
pandas.core.tools.datetimes.to_datetime.date
pandas.core.tools.datetimes.to_datetime.dt.strftime
pandas.core.tools.datetimes.to_datetime.strftime
pandas.io.parsers.readers.TextFileReader.apply
pandas.io.parsers.readers.TextFileReader.astype
pandas.io.parsers.readers.TextFileReader.columns
pandas.io.parsers.readers.TextFileReader.copy
pandas.io.parsers.readers.TextFileReader.drop
pandas.io.parsers.readers.TextFileReader.drop_duplicates
pandas.io.parsers.readers.TextFileReader.fillna
pandas.io.parsers.readers.TextFileReader.groupby
pandas.io.parsers.readers.TextFileReader.head
pandas.io.parsers.readers.TextFileReader.iloc
pandas.io.parsers.readers.TextFileReader.isin
pandas.io.parsers.readers.TextFileReader.iterrows
pandas.io.parsers.readers.TextFileReader.loc
pandas.io.parsers.readers.TextFileReader.merge
pandas.io.parsers.readers.TextFileReader.rename
pandas.io.parsers.readers.TextFileReader.shape
pandas.io.parsers.readers.TextFileReader.to_csv
pandas.io.parsers.readers.TextFileReader.to_excel
pandas.io.parsers.readers.TextFileReader.unique
pandas.io.parsers.readers.TextFileReader.values
pandas.tseries.offsets
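Many of the elements above live on `pandas.io.parsers.readers.TextFileReader`, the object `read_csv` returns when `chunksize` is set. A minimal plain-pandas sketch of producing and consuming one (illustrative only; it says nothing about which of its methods the SMA supports):

```python
import io
import pandas as pd

# read_csv with chunksize returns a TextFileReader that yields
# DataFrames lazily instead of loading the whole file at once.
buf = io.StringIO("a,b\n1,2\n3,4\n5,6\n")
reader = pd.read_csv(buf, chunksize=2)
chunks = list(reader)                             # two chunks: 2 rows + 1 row
total_rows = sum(len(chunk) for chunk in chunks)  # 3 data rows in total
print(len(chunks), total_rows)
```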
Version 2.2.3 (Oct 24, 2024)¶
Application Version 2.2.3¶
Included SMA Core Versions¶
Snowpark Conversion Core 4.10.0
Desktop App¶
Fixed¶
Fixed a bug that caused the SMA to show the label SnowConvert instead of Snowpark Migration Accelerator in the menu bar of the Windows version.
Fixed a bug that caused the SMA to crash when it did not have read and write permissions to the .config directory in macOS and the AppData directory in Windows.
Command Line Interface¶
Changed
Renamed the CLI executable from snowct to sma.
Removed the source language argument, so you no longer need to specify whether you are running a Python or Scala assessment / conversion.
Expanded the command line arguments supported by the CLI by adding the following new arguments:
--enableJupyter | -j: flag to indicate whether the conversion of Databricks notebooks to Jupyter is enabled.
--sql | -f: database engine syntax to be used when a SQL command is detected.
--customerEmail | -e: configure the customer email.
--customerCompany | -c: configure the customer company.
--projectName | -p: configure the customer project.
Updated some texts to reflect the correct name of the application, ensuring consistency and clarity in all the messages.
Updated the terms of use of the application.
Updated and expanded the documentation of the CLI to reflect the latest features, enhancements, and changes.
Updated the text that is shown before proceeding with the execution of the SMA to improve clarity.
Updated the CLI to accept “Yes” as a valid argument when prompting for user confirmation.
Allowed the CLI to continue the execution without waiting for user interaction by specifying the -y or --yes argument.
Updated the help information of the --sql argument to show the values that this argument expects.
Snowpark Conversion Core Version 4.10.0¶
Added¶
Added a new EWI for the pyspark.sql.readwriter.DataFrameWriter.partitionBy function. All the usages of this function will now have the EWI SPRKPY1081.
Added a new column called Technology to the ImportUsagesInventory.csv file.
Changed¶
Updated the Third-Party Libraries readiness score to also take into account the Unknown libraries.
Updated the AssessmentFiles.zip file to include .json files instead of .pam files.
Improved the CSV-to-JSON conversion mechanism to make the processing of inventories more performant.
Improved the documentation of the following EWIs:
SPRKPY1029
SPRKPY1054
SPRKPY1055
SPRKPY1063
SPRKPY1075
SPRKPY1076
Updated the mapping status of the following Spark Scala elements from Direct to Rename:
org.apache.spark.sql.functions.shiftLeft => com.snowflake.snowpark.functions.shiftleft
org.apache.spark.sql.functions.shiftRight => com.snowflake.snowpark.functions.shiftright
Updated the mapping status of the following Spark Scala elements from Not Supported to Direct:
org.apache.spark.sql.functions.shiftleft => com.snowflake.snowpark.functions.shiftleft
org.apache.spark.sql.functions.shiftright => com.snowflake.snowpark.functions.shiftright
Fixed¶
Fixed a bug that caused the SMA to incorrectly populate the Origin column of the ImportUsagesInventory.csv file.
Fixed a bug that caused the SMA to not classify imports of the libraries io, json, logging, and unittest as Python built-in imports in the ImportUsagesInventory.csv file and in the DetailedReport.docx file.
Version 2.2.2 (Oct 11, 2024)¶
Application Version 2.2.2¶
Feature Updates include:
Snowpark Conversion Core 4.8.0
Snowpark Conversion Core Version 4.8.0¶
Added¶
Added EwiCatalog.csv and .md files to reorganize the documentation.
Added the mapping status of pyspark.sql.functions.ln as Direct.
Added a transformation for pyspark.context.SparkContext.getOrCreate. Please check the EWI SPRKPY1080 for further details.
Added an improvement to the SymbolTable: it now infers the type of parameters in functions.
Added SymbolTable support for static methods, so the first parameter is no longer assumed to be self for them.
Added documentation for missing EWIs
SPRKHVSQL1005
SPRKHVSQL1006
SPRKSPSQL1005
SPRKSPSQL1006
SPRKSCL1002
SPRKSCL1170
SPRKSCL1171
SPRKPY1057
SPRKPY1058
SPRKPY1059
SPRKPY1060
SPRKPY1061
SPRKPY1064
SPRKPY1065
SPRKPY1066
SPRKPY1067
SPRKPY1069
SPRKPY1070
SPRKPY1077
SPRKPY1078
SPRKPY1079
SPRKPY1101
Changed¶
Updated the mapping status of pyspark.sql.functions.array_remove from NotSupported to Direct.
Fixed¶
Fixed the Code File Sizing table in the Detailed Report to exclude .sql and .hql files, and added the Extra Large row to the table.
Fixed the missing update_query_tag call when SparkSession is defined across multiple lines in Python.
Fixed the missing update_query_tag call when SparkSession is defined across multiple lines in Scala.
Fixed the missing EWI SPRKHVSQL1001 on some SQL statements with parsing errors.
Fixed the handling of newline values inside string literals so they are now kept.
Fixed the Total Lines of Code shown in the File Type Summary table.
Fixed the Parsing Score being shown as 0 when files are recognized successfully.
Fixed the LOC count in the cell inventory for Databricks Magic SQL cells.
Version 2.2.0 (Sep 26, 2024)¶
Application Version 2.2.0¶
Feature Updates include:
Snowpark Conversion Core 4.6.0
Snowpark Conversion Core Version 4.6.0¶
Added¶
Added a transformation for pyspark.sql.readwriter.DataFrameReader.parquet.
Added a transformation for pyspark.sql.readwriter.DataFrameReader.option when it is a Parquet method.
Changed¶
Updated the mapping status of:
pyspark.sql.types.StructType.fields from NotSupported to Direct.
pyspark.sql.types.StructType.names from NotSupported to Direct.
pyspark.context.SparkContext.setLogLevel from Workaround to Transformation. More detail can be found in EWIs SPRKPY1078 and SPRKPY1079.
org.apache.spark.sql.functions.round from WorkAround to Direct.
org.apache.spark.sql.functions.udf from NotDefined to Transformation. More detail can be found in EWIs SPRKSCL1174 and SPRKSCL1175.
Updated the mapping status of the following Spark elements from DirectHelper to Direct:
org.apache.spark.sql.functions.hex
org.apache.spark.sql.functions.unhex
org.apache.spark.sql.functions.shiftleft
org.apache.spark.sql.functions.shiftright
org.apache.spark.sql.functions.reverse
org.apache.spark.sql.functions.isnull
org.apache.spark.sql.functions.unix_timestamp
org.apache.spark.sql.functions.randn
org.apache.spark.sql.functions.signum
org.apache.spark.sql.functions.sign
org.apache.spark.sql.functions.collect_list
org.apache.spark.sql.functions.log10
org.apache.spark.sql.functions.log1p
org.apache.spark.sql.functions.base64
org.apache.spark.sql.functions.unbase64
org.apache.spark.sql.functions.regexp_extract
org.apache.spark.sql.functions.expr
org.apache.spark.sql.functions.date_format
org.apache.spark.sql.functions.desc
org.apache.spark.sql.functions.asc
org.apache.spark.sql.functions.size
org.apache.spark.sql.functions.locate
org.apache.spark.sql.functions.ntile
Fixed¶
Fixed the value shown in the Percentage of Total Pandas API.
Fixed the Total percentage in the ImportCalls table of the Detailed Report.
Deprecated¶
Deprecated the following EWI code:
SPRKSCL1115
Version 2.1.7 (Sep 12, 2024)¶
Application Version 2.1.7¶
Feature Updates include:
Snowpark Conversion Core 4.5.7
Snowpark Conversion Core 4.5.2
Snowpark Conversion Core Version 4.5.7¶
Hotfixed¶
Fixed the Total row being added to the Spark Usages Summaries when there are no usages.
Bumped the Python Assembly to Version=1.3.111.
Parse trailing commas in multiline arguments.
Snowpark Conversion Core Version 4.5.2¶
Added¶
Added a transformation for pyspark.sql.readwriter.DataFrameReader.option:
When the chain is from a CSV method call.
When the chain is from a JSON method call.
Added a transformation for pyspark.sql.readwriter.DataFrameReader.json.
Changed¶
Executed SMA on SQL strings passed to Python/Scala functions
Create AST in Scala/Python to emit temporary SQL unit
Create SqlEmbeddedUsages.csv inventory
Deprecate SqlStatementsInventory.csv and SqlExtractionInventory.csv
Integrate EWI when the SQL literal could not be processed
Create new task to process SQL-embedded code
Collect info for SqlEmbeddedUsages.csv inventory in Python
Replace SQL transformed code to Literal in Python
Update test cases after implementation
Create table, views for telemetry in SqlEmbeddedUsages inventory
Collect info for SqlEmbeddedUsages.csv report in Scala
Replace SQL transformed code to Literal in Scala
Check line number order for Embedded SQL reporting
Filled the SqlFunctionsInfo.csv with the SQL functions documented for SparkSQL and HiveSQL.
Updated the mapping status for:
org.apache.spark.sql.SparkSession.sparkContext from NotSupported to Transformation.
org.apache.spark.sql.Builder.config from NotSupported to Transformation. With this new mapping status, the SMA will remove all the usages of this function from the source code.
Version 2.1.6 (Sep 5, 2024)¶
Application Version 2.1.6¶
Hotfix change for Snowpark Engines Core version 4.5.1
Spark Conversion Core Version 4.5.1¶
Hotfix
Added a mechanism to convert the temporary Databricks notebooks generated by the SMA into exported Databricks notebooks.
Version 2.1.5 (Aug 29, 2024)¶
Application Version 2.1.5¶
Feature Updates include:
Updated Spark Conversion Core: 4.3.2
Spark Conversion Core Version 4.3.2¶
Added¶
Added a mechanism (via decoration) to get the line and column of the elements identified in notebook cells.
Added an EWI for pyspark.sql.functions.from_json.
Added a transformation for pyspark.sql.readwriter.DataFrameReader.csv.
Enabled the query tag mechanism for Scala files.
Added the Code Analysis Score and additional links to the Detailed Report.
Added a column called OriginFilePath to InputFilesInventory.csv
Changed¶
Updated the mapping status of pyspark.sql.functions.from_json from Not Supported to Transformation.
Updated the mapping status of the following Spark elements from Workaround to Direct:
org.apache.spark.sql.functions.countDistinct
org.apache.spark.sql.functions.max
org.apache.spark.sql.functions.min
org.apache.spark.sql.functions.mean
Deprecated¶
Deprecated the following EWI codes:
SPRKSCL1135
SPRKSCL1136
SPRKSCL1153
SPRKSCL1155
Fixed¶
Fixed a bug that caused an incorrect calculation of the Spark API score.
Fixed an error that prevented empty or commented-out SQL files from being copied to the output folder.
Fixed a bug in the DetailedReport where the notebook stats for LOC and cell count were not accurate.
Version 2.1.2 (Aug 14, 2024)¶
Application Version 2.1.2¶
Feature Updates include:
Updated Spark Conversion Core: 4.2.0
Spark Conversion Core Version 4.2.0¶
Added¶
Added the technology column to the SparkUsagesInventory.
Added an EWI for not-defined SQL elements.
Added the SqlFunctions inventory.
Collect info for the SqlFunctions inventory.
Changed¶
The engine now processes and prints partially parsed Python files instead of leaving the original file unmodified.
Python notebook cells that have parsing errors will also be processed and printed.
Fixed¶
Fixed pandas.core.indexes.datetimes.DatetimeIndex.strftime being reported wrongly.
Fixed a mismatch between the SQL readiness score and SQL Usages by Support Status.
Fixed a bug that caused the SMA to report pandas.core.series.Series.empty with an incorrect mapping status.
Fixed a mismatch where the Spark API Usages Ready for Conversion in DetailedReport.docx differed from the UsagesReadyForConversion row in Assessment.json.
Version 2.1.1 (Aug 8, 2024)¶
Application Version 2.1.1¶
Feature Updates include:
Updated Spark Conversion Core: 4.1.0
Spark Conversion Core Version 4.1.0¶
Added¶
Added the following information to the AssessmentReport.json file:
The third-party libraries readiness score.
The number of third-party library calls that were identified.
The number of third-party library calls that are supported in Snowpark.
The color code associated with the third-party readiness score, the Spark API readiness score, and the SQL readiness score.
Transformed SqlSimpleDataType in Spark create tables.
Added the mapping of pyspark.sql.functions.get as direct.
Added the mapping of pyspark.sql.functions.to_varchar as direct.
As part of the changes after unification, the tool now generates an execution info file in the Engine.
Added a replacer for pyspark.sql.SparkSession.builder.appName.
Changed¶
Updated the mapping status for the following Spark elements:
From Not Supported to Direct mapping:
pyspark.sql.functions.sign
pyspark.sql.functions.signum
Changed the Notebook Cells Inventory report to indicate the kind of content for every cell in the Element column.
Added a SCALA_READINESS_SCORE column that reports the readiness score as related only to references to the Spark API in Scala files.
Partial support to transform table properties in ALTER TABLE and ALTER VIEW.
Updated the conversion status of the SqlSimpleDataType node from Pending to Transformation in Spark create tables.
Updated the version of the Snowpark Scala API supported by the SMA from 1.7.0 to 1.12.1:
Updated the mapping status of:
org.apache.spark.sql.SparkSession.getOrCreate from Rename to Direct
org.apache.spark.sql.functions.sum from Workaround to Direct
Updated the version of the Snowpark Python API supported by the SMA from 1.15.0 to 1.20.0:
Updated the mapping status of:
pyspark.sql.functions.arrays_zip from Not Supported to Direct
Updated the mapping status for the following Pandas elements:
Direct mappings:
pandas.core.frame.DataFrame.any
pandas.core.frame.DataFrame.applymap
Updated the mapping status for the following Pandas elements:
From Not Supported to Direct mapping:
pandas.core.frame.DataFrame.groupby
pandas.core.frame.DataFrame.index
pandas.core.frame.DataFrame.T
pandas.core.frame.DataFrame.to_dict
From Not Supported to Rename mapping:
pandas.core.frame.DataFrame.map
Updated the mapping status for the following Pandas elements:
Direct mappings:
pandas.core.frame.DataFrame.where
pandas.core.groupby.generic.SeriesGroupBy.agg
pandas.core.groupby.generic.SeriesGroupBy.aggregate
pandas.core.groupby.generic.DataFrameGroupBy.agg
pandas.core.groupby.generic.DataFrameGroupBy.aggregate
pandas.core.groupby.generic.DataFrameGroupBy.apply
Not Supported mappings:
pandas.core.frame.DataFrame.to_parquet
pandas.core.generic.NDFrame.to_csv
pandas.core.generic.NDFrame.to_excel
pandas.core.generic.NDFrame.to_sql
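As a plain-pandas illustration of one of the Direct-mapped groupby elements above, `DataFrameGroupBy.agg` aggregates a column per group; under a Direct mapping the same call shape is expected to carry over to Snowpark pandas unchanged (this sketch runs against regular pandas only):

```python
import pandas as pd

# DataFrameGroupBy.agg (Direct mapping): per-group aggregation
# expressed as a column -> function mapping.
df = pd.DataFrame({"key": ["a", "a", "b"], "val": [1, 2, 3]})
totals = df.groupby("key").agg({"val": "sum"})
print(totals["val"].tolist())  # group "a" sums to 3, group "b" to 3
```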
Updated the mapping status for the following Pandas elements:
Direct mappings:
pandas.core.series.Series.empty
pandas.core.series.Series.apply
pandas.core.reshape.tile.qcut
Direct mappings with EWI:
pandas.core.series.Series.fillna
pandas.core.series.Series.astype
pandas.core.reshape.melt.melt
pandas.core.reshape.tile.cut
pandas.core.reshape.pivot.pivot_table
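To illustrate one of the "Direct with EWI" elements above in plain pandas, `pandas.core.reshape.tile.cut` bins values into labeled intervals; the EWI flags the converted call so its argument usage can be reviewed (this sketch runs against regular pandas only):

```python
import pandas as pd

# pd.cut (Direct mapping with EWI): bins are right-inclusive by default,
# so 5 -> (0, 18], 25 -> (18, 65], 70 -> (65, 100].
ages = pd.Series([5, 25, 70])
buckets = pd.cut(ages, bins=[0, 18, 65, 100], labels=["minor", "adult", "senior"])
print(buckets.tolist())  # ['minor', 'adult', 'senior']
```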
Updated the mapping status for the following Pandas elements:
Direct mappings:
pandas.core.series.Series.dt
pandas.core.series.Series.groupby
pandas.core.series.Series.loc
pandas.core.series.Series.shape
pandas.core.tools.datetimes.to_datetime
pandas.io.excel._base.ExcelFile
Not Supported mappings:
pandas.core.series.Series.dt.strftime
Updated the mapping status for the following Pandas elements:
From Not Supported to Direct mapping:
pandas.io.parquet.read_parquet
pandas.io.parsers.readers.read_csv
Updated the mapping status for the following Pandas elements:
From Not Supported to Direct mapping:
pandas.io.pickle.read_pickle
pandas.io.sql.read_sql
pandas.io.sql.read_sql_query
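A plain-pandas illustration of `pandas.io.pickle.read_pickle`, one of the elements above, as a round-trip through an in-memory buffer; under a Direct mapping the same call is expected to work unchanged under Snowpark pandas (this sketch runs against regular pandas only):

```python
import io
import pandas as pd

# read_pickle (Direct mapping): write a frame to a pickle buffer,
# then read it back; file paths work the same way as buffers.
buf = io.BytesIO()
pd.DataFrame({"x": [1, 2]}).to_pickle(buf)
buf.seek(0)
restored = pd.read_pickle(buf)
print(restored["x"].tolist())  # [1, 2]
```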
Updated the description of Understanding the SQL Readiness Score.
Updated PyProgramCollector to collect the packages and populate the current packages inventory with data from Python source code.
Updated the mapping status of pyspark.sql.SparkSession.builder.appName from Rename to Transformation.
Removed the following Scala integration tests:
AssesmentReportTest_AssessmentMode.ValidateReports_AssessmentMode
AssessmentReportTest_PythonAndScala_Files.ValidateReports_PythonAndScala
AssessmentReportTestWithoutSparkUsages.ValidateReports_WithoutSparkUsages
Updated the mapping status of pandas.core.generic.NDFrame.shape from Not Supported to Direct.
Updated the mapping status of pandas.core.series from Not Supported to Direct.
Deprecated¶
Deprecated the EWI code SPRKSCL1160, since org.apache.spark.sql.functions.sum is now a direct mapping.
Fixed¶
Fixed a bug caused by Custom Magics without arguments not being supported in Jupyter Notebook cells.
Fixed incorrect generation of EWIs in the issues.csv report when parsing errors occur.
Fixed a bug that caused the SMA to not process Databricks exported notebooks as Databricks notebooks.
Fixed a stack overflow error while processing clashing type names of declarations created inside package objects.
Fixed the processing of complex lambda type names involving generics, e.g., def func[X,Y](f: (Map[Option[X], Y] => Map[Y, X]))...
Fixed a bug that caused the SMA to add a PySpark EWI code instead of a Pandas EWI code to the Pandas elements that are not yet recognized.
Fixed a typo in the detailed report template: renaming a column from “Percentage of all Python Files” to “Percentage of all files”.
Fixed a bug where pandas.core.series.Series.shape was wrongly reported.