Snowpark Migration Accelerator: Release Notes¶
Note that the release notes below are organized by release date. Version numbers for both the application and the conversion core appear below.
Version 2.10.0 (Sep 24, 2025)¶
Application & CLI Version 2.10.0¶
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.62
Added¶
Added functionality to migrate SQL embedded with Python format interpolation.
Added support for DataFrame.select and DataFrame.sort transformations for greater data processing flexibility.
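As a sketch of the pattern the SQL-interpolation item targets, the snippet below builds an embedded SQL statement with Python format interpolation. The table name, column, and filter value are made up for illustration; only the interpolation pattern itself reflects the release note.

```python
# Hypothetical example of SQL embedded via Python format interpolation,
# the shape of code this release can now migrate.
table_name = "sales"
region = "EMEA"

# f-string interpolation producing an embedded SQL statement
query = f"SELECT SUM(amount) FROM {table_name} WHERE region = '{region}'"

# str.format interpolation, another common shape of embedded SQL
query_fmt = "SELECT SUM(amount) FROM {} WHERE region = '{}'".format(table_name, region)

print(query)
```

Both forms produce the same SQL text; what matters for migration is that the statement is assembled through interpolation rather than written as a single literal.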
Changed¶
Bumped the supported versions of Snowpark Python API and Snowpark Pandas API to 1.36.0.
Updated the mapping status of pandas.core.frame.DataFrame.boxplot from Not Supported to Direct.
Updated the mapping status of DataFrame.select, Dataset.select, DataFrame.sort and Dataset.sort from Direct to Transformation. Snowpark Scala allows a sequence of columns to be passed directly to the select and sort functions, so this transformation changes all usages such as df.select(cols: _*) to df.select(cols) and df.sort(cols: _*) to df.sort(cols).
Bumped Python AST and Parser version to 149.1.9.
Updated the status to Direct for pandas functions:
pandas.core.frame.DataFrame.to_excel
pandas.core.series.Series.to_excel
pandas.io.feather_format.read_feather
pandas.io.orc.read_orc
pandas.io.stata.read_stata
Updated the status of pyspark.sql.pandas.map_ops.PandasMapOpsMixin.mapInPandas to Workaround, using EWI SPRKPY1102.
Fixed¶
Fixed issue that affected SqlEmbedded transformations when using chained method calls.
Fixed transformations involving PySqlExpr using the new PyLiteralSql to avoid losing Tails.
Resolved internal stability issues to improve tool robustness and reliability.
Version 2.7.7 (Aug 28, 2025)¶
Application & CLI Version 2.7.7¶
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.46
Added¶
Added new Pandas EWI documentation PNDSPY1011.
Added support for the following Pandas functions:
pandas.core.algorithms.unique
pandas.core.dtypes.missing.isna
pandas.core.dtypes.missing.isnull
pandas.core.dtypes.missing.notna
pandas.core.dtypes.missing.notnull
pandas.core.resample.Resampler.count
pandas.core.resample.Resampler.max
pandas.core.resample.Resampler.mean
pandas.core.resample.Resampler.median
pandas.core.resample.Resampler.min
pandas.core.resample.Resampler.size
pandas.core.resample.Resampler.sum
pandas.core.arrays.timedeltas.TimedeltaArray.total_seconds
pandas.core.series.Series.get
pandas.core.series.Series.to_frame
pandas.core.frame.DataFrame.assign
pandas.core.frame.DataFrame.get
pandas.core.frame.DataFrame.to_numpy
pandas.core.indexes.base.Index.is_unique
pandas.core.indexes.base.Index.has_duplicates
pandas.core.indexes.base.Index.shape
pandas.core.indexes.base.Index.array
pandas.core.indexes.base.Index.str
pandas.core.indexes.base.Index.equals
pandas.core.indexes.base.Index.identical
pandas.core.indexes.base.Index.unique
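To illustrate a few of the newly supported mappings, the snippet below exercises some of the listed functions in plain pandas; under the SMA mapping these same calls run through the Snowpark pandas API (via modin.pandas). The sample data is made up for the example.

```python
import pandas as pd

# Missing-value helpers now mapped: pandas.isna / pandas.notna
s = pd.Series([1.0, None, 3.0], name="x")
missing = pd.isna(s)   # elementwise mask of missing values
present = pd.notna(s)

# Series.to_frame and DataFrame.assign, both newly supported
frame = s.to_frame()                               # single-column DataFrame "x"
frame = frame.assign(doubled=frame["x"] * 2)       # add a derived column

# Index.is_unique / Index.has_duplicates, also in the new list
idx = pd.Index([1, 2, 2])
print(missing.tolist(), list(frame.columns), idx.is_unique)
```

The point of the mapping work is that code like this should behave the same after migration, with the heavy lifting pushed down to Snowflake.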
Added support for the following Spark Scala functions:
org.apache.spark.sql.functions.format_number
org.apache.spark.sql.functions.from_unixtime
org.apache.spark.sql.functions.instr
org.apache.spark.sql.functions.months_between
org.apache.spark.sql.functions.pow
org.apache.spark.sql.functions.to_unix_timestamp
org.apache.spark.sql.Row.getAs
Changed¶
Bumped the version of Snowpark Pandas API supported by the SMA to 1.33.0.
Bumped the version of Snowpark Scala API supported by the SMA to 1.16.0.
Updated the mapping status of pyspark.sql.group.GroupedData.pivot from Transformation to Direct.
Updated the mapping status of org.apache.spark.sql.Builder.master from NotSupported to Transformation. This transformation removes all the identified usages of this element during code conversion.
Updated the mapping status of org.apache.spark.sql.types.StructType.fieldIndex from NotSupported to Direct.
Updated the mapping status of org.apache.spark.sql.Row.fieldIndex from NotSupported to Direct.
Updated the mapping status of org.apache.spark.sql.SparkSession.stop from NotSupported to Rename. All the identified usages of this element are renamed to com.snowflake.snowpark.Session.close during code conversion.
Updated the mapping status of org.apache.spark.sql.DataFrame.unpersist and org.apache.spark.sql.Dataset.unpersist from NotSupported to Transformation. This transformation removes all the identified usages of this element during code conversion.
Fixed¶
Fixed continuation backslash on removed tailed functions.
Fixed the LIBRARY_PREFIX column in the ConversionStatusLibraries.csv file to use the right identifier for the scikit-learn library family (scikit-*).
Fixed a bug where multiline grouped operations were not parsed.
Version 2.9.0 (Sep 09, 2025)¶
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.53
Added¶
The following mappings are now performed for org.apache.spark.sql.Dataset[T]:
org.apache.spark.sql.Dataset.union is now com.snowflake.snowpark.DataFrame.unionAll
org.apache.spark.sql.Dataset.unionByName is now com.snowflake.snowpark.DataFrame.unionAllByName
Added support for org.apache.spark.sql.functions.broadcast as a transformation.
Changed¶
Increased the supported Snowpark Python API version for SMA from 1.27.0 to 1.33.0.
The status of the pyspark.sql.functions.randn function has been updated to Direct.
Fixed¶
Resolved an issue where org.apache.spark.SparkContext.parallelize was not resolving; it is now supported as a transformation.
Fixed the Dataset.persist transformation to work with any type of Dataset, not just Dataset[Row].
Version 2.7.6 (Jul 17, 2025)¶
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.30
Added¶
Adjusted mappings for spark.DataReader methods:
DataFrame.union is now DataFrame.unionAll.
DataFrame.unionByName is now DataFrame.unionAllByName.
Added multi-level artifact dependency columns to the artifact inventory.
Added new Pandas EWI documentation, from PNDSPY1005 to PNDSPY1010.
Added a specific EWI for pandas.core.series.Series.apply.
Changed¶
Bumped the version of Snowpark Pandas API supported by the SMA from 1.27.0 to 1.30.0.
Fixed¶
Fixed an issue with missing values in the formula to get the SQL readiness score.
Fixed a bug that was causing some Pandas elements to have the default EWI message from PySpark.
Version 2.7.5 (Jul 2, 2025)¶
Application & CLI Version 2.7.5¶
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.19
Changed¶
Refactored Pandas imports: Pandas imports now use `modin.pandas` instead of `snowflake.snowpark.modin.pandas`.
Improved `dbutils` and Magic Commands transformation:
A new `sfutils.py` file is now generated, and all `dbutils` prefixes are replaced with `sfutils`.
For Databricks (DBX) notebooks, an implicit import for `sfutils` is automatically added.
The `sfutils` module simulates various `dbutils` methods, including file system operations (`dbutils.fs`) via a defined Snowflake FileSystem (SFFS) stage, and handles notebook execution (`dbutils.notebook.run`) by transforming it to `EXECUTE NOTEBOOK` SQL functions.
`dbutils.notebook.exit` is removed as it is not required in Snowflake.
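A minimal sketch, not SMA's actual implementation, of the prefix replacement described above: every standalone `dbutils` reference is rewritten to `sfutils`, which the generated `sfutils.py` file then emulates. The function name and regex here are assumptions for illustration.

```python
import re

def rewrite_dbutils_calls(source: str) -> str:
    # \b word boundaries keep identifiers like `my_dbutils` untouched;
    # only bare `dbutils` references are rewritten to `sfutils`.
    return re.sub(r"\bdbutils\b", "sfutils", source)

cell = "files = dbutils.fs.ls('/mnt/raw')\ndbutils.notebook.run('etl', 60)"
print(rewrite_dbutils_calls(cell))
```

In the real tool this rewrite is paired with an implicit `import sfutils` added to DBX notebooks, so the renamed calls resolve against the generated shim.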
Fixed¶
Updates in SnowConvert Reports: SnowConvert reports now include the CellId column when instances originate from SMA, and the FileName column displays the full path.
Updated Artifacts Dependency for SnowConvert Reports: The SMA's artifact inventory report, which was previously impacted by the integration of SnowConvert, has been restored. This update enables the SMA tool to accurately capture and analyze Object References and Missing Object References directly from SnowConvert reports, thereby ensuring the correct retrieval of SQL dependencies for the inventory.
Version 2.7.4 (Jun 26, 2025)¶
Application & CLI Version 2.7.4¶
Desktop App
Added¶
Added telemetry improvements.
Fixed¶
Fixed documentation links in the conversion settings pop-up and Pandas EWIs.
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.16
Added¶
Transforming Spark XML to Snowpark.
Databricks SQL option in the SQL source language.
Transforming JDBC read connections.
Changed¶
All the SnowConvert reports are copied to the backup Zip file.
The folder is renamed from SqlReports to SnowConvertReports.
SqlFunctionsInventory is moved to the folder Reports.
All the SnowConvert Reports are sent to Telemetry.
Fixed¶
Fixed a non-deterministic issue with the SQL Readiness Score.
Fixed a false-positive critical result that made the desktop crash.
Fixed issue causing the Artifacts dependency report not to show the SQL objects.
Version 2.7.2 (Jun 10, 2025)¶
Application & CLI Version 2.7.2¶
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.2
Fixed¶
Addressed an issue with SMA execution on the latest Windows OS, as previously reported. This fix resolves the issues encountered in version 2.7.1.
Version 2.7.1 (Jun 9, 2025)¶
Application & CLI Version 2.7.1¶
Included SMA Core Versions¶
Snowpark Conversion Core 8.0.1
Added¶
The Snowpark Migration Accelerator (SMA) now orchestrates SnowConvert (https://docs.snowconvert.com/sc/general/about) to process SQL found in user workloads, including embedded SQL in Python / Scala code, Notebook SQL cells, .sql files, and .hql files.
SnowConvert now enhances the previous SMA capabilities:
Spark SQL (https://docs.snowconvert.com/sc/translation-references/spark-dbx)
A new folder in Reports, called SQL Reports, contains the reports generated by SnowConvert.
Known Issues¶
SQL reports from the previous SMA version will appear empty for the following:
Reports/SqlElementsInventory.csv, partially covered by Reports/SqlReports/Elements.yyyymmdd.hhmmss.csv.
For Reports/SqlFunctionsInventory.csv, refer to the new location with the same name at Reports/SqlReports/SqlFunctionsInventory.csv.
The artifact dependency inventory:
In the ArtifactDependencyInventory, the column for the SQL object will appear empty.
Version 2.6.10 (May 5, 2025)¶
Application & CLI Version 2.6.10¶
Included SMA Core Versions¶
Snowpark Conversion Core 7.4.0
Fixed¶
Fixed wrong values in the 'checkpoints.json' file.
The 'sample' value was without decimals (for integer values) and quotes.
The 'entryPoint' value had dots instead of slashes and was missing the file extension.
Updated the default value to TRUE for the setting 'Convert DBX notebooks to Snowflake notebooks'
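To make the two checkpoints.json fixes above concrete, the snippet below builds an entry with the corrected shapes: a 'sample' value written with decimals and quotes, and an 'entryPoint' using slashes and keeping the file extension. The surrounding schema and the example path are assumptions; only the 'sample' and 'entryPoint' fields come from the release note.

```python
import json

# Hypothetical checkpoints.json entry showing the corrected value formats.
checkpoint = {
    # integer sample rates are now written with decimals, as quoted values
    "sample": "1.0",
    # entry points now use slashes (not dots) and include the file extension
    "entryPoint": "notebooks/etl_job.py",
}

print(json.dumps(checkpoint))
```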
Version 2.6.8 (Apr 28, 2025)¶
Application & CLI Version 2.6.8¶
Desktop App¶
Added checkpoints execution settings mechanism recognition.
Added a mechanism to collect DBX magic commands into DbxElementsInventory.csv.
Added 'checkpoints.json' generation into the input directory.
Added a new EWI for all unsupported magic commands.
Added the collection of dbutils into DbxElementsInventory.csv from Scala source notebooks.
Included SMA Core Versions¶
Snowpark Conversion Core 7.2.53
Changed¶
Updates made to handle transformations from DBX Scala elements to Jupyter Python elements, commenting out the entire code of the cell.
Updates made to handle transformations of dbutils.notebook.run and "r" commands; for the latter, the entire code of the cell is also commented out.
Updated the name and the key letter used for the conversion of notebook files.
Fixed¶
Fixed the bug that was causing the transformation of DBX notebooks into .ipynb files to have the wrong format.
Fixed the bug that was causing .py DBX notebooks to not be transformable into .ipynb files.
Fixed a bug that was causing comments to be missing in the output code of DBX notebooks.
Fixed a bug that was causing raw Scala files to be converted into ipynb files.
Version 2.6.7 (Apr 21, 2025)¶
Application & CLI Version 2.6.7¶
Included SMA Core Versions¶
Snowpark Conversion Core 7.2.42
Changed¶
Updated the DataFramesInventory to fill the EntryPoints column.
Version 2.6.6 (Apr 7, 2025)¶
Application & CLI Version 2.6.6¶
Desktop App¶
Added¶
Updated the DBx EWI link in the UI results page.
Included SMA Core Versions¶
Snowpark Conversion Core 7.2.39
Added¶
Added Execution Flow inventory generation.
Added implicit session setup in every DBx notebook transformation
Changed¶
Renamed the DbUtilsUsagesInventory.csv to DbxElementsInventory.csv
Fixed¶
Fixed a bug that caused a Parsing error when a backslash came after a type hint.
Fixed relative imports that do not start with a dot and relative imports with a star.
Version 2.6.5 (Mar 27, 2025)¶
Application & CLI Version 2.6.5¶
Desktop App¶
Added¶
Added a new conversion setting toggle to enable or disable the Sma-Checkpoints feature.
Fixed a report issue so the app does not crash when the POST API returns 500.
Included SMA Core Versions¶
Snowpark Conversion Core 7.2.26
Added¶
Added generation of the checkpoints.json file into the output folder based on the DataFramesInventory.csv.
Added "disableCheckpoints" flag into the CLI commands and additional parameters of the code processor.
Added a new replacer for Python to transform the dbutils.notebook.run node.
Added new replacers to transform the magic %run command.
Added new replacers (Python and Scala) to remove the dbutils.notebook.exit node.
Added Location column to artifacts inventory.
Changed¶
Refactored the normalized directory separator used in some parts of the solution.
Centralized the DBC extraction working folder name handling.
Updated Snowpark and Pandas version to v1.27.0
Updated the artifacts inventory columns to:
Name -> Dependency
File -> FileId
Status -> Status_detail
Added new column to the artifacts inventory:
Success
Fixed¶
Dataframes inventory was not being uploaded to the stage correctly.
Version 2.6.4 (Mar 12, 2025)¶
Application & CLI Version 2.6.4¶
Included SMA Core Versions¶
Snowpark Conversion Core 7.2.0
Added¶
An Artifact Dependency Inventory
A replacer and EWI for the pyspark.sql.types.StructType.fieldNames method, mapped to the snowflake.snowpark.types.StructType.fieldNames attribute.
The following PySpark functions with the status:
Direct Status
pyspark.sql.functions.bitmap_bit_position
pyspark.sql.functions.bitmap_bucket_number
pyspark.sql.functions.bitmap_construct_agg
pyspark.sql.functions.equal_null
pyspark.sql.functions.ifnull
pyspark.sql.functions.localtimestamp
pyspark.sql.functions.max_by
pyspark.sql.functions.min_by
pyspark.sql.functions.nvl
pyspark.sql.functions.regr_avgx
pyspark.sql.functions.regr_avgy
pyspark.sql.functions.regr_count
pyspark.sql.functions.regr_intercept
pyspark.sql.functions.regr_slope
pyspark.sql.functions.regr_sxx
pyspark.sql.functions.regr_sxy
pyspark.sql.functions.regr
NotSupported
pyspark.sql.functions.map_contains_key
pyspark.sql.functions.position
pyspark.sql.functions.regr_r2
pyspark.sql.functions.try_to_binary
The following Pandas functions with status
pandas.core.series.Series.str.ljust
pandas.core.series.Series.str.center
pandas.core.series.Series.str.pad
pandas.core.series.Series.str.rjust
Updated the following PySpark functions with the status:
From WorkAround to Direct
pyspark.sql.functions.acosh
pyspark.sql.functions.asinh
pyspark.sql.functions.atanh
pyspark.sql.functions.instr
pyspark.sql.functions.log10
pyspark.sql.functions.log1p
pyspark.sql.functions.log2
From NotSupported to Direct
pyspark.sql.functions.bit_length
pyspark.sql.functions.cbrt
pyspark.sql.functions.nth_value
pyspark.sql.functions.octet_length
pyspark.sql.functions.base64
pyspark.sql.functions.unbase64
Updated the following Pandas functions with the status:
From NotSupported to Direct
pandas.core.frame.DataFrame.pop
pandas.core.series.Series.between
pandas.core.series.Series.pop
Version 2.6.3 (Mar 6, 2025)¶
Application & CLI Version 2.6.3¶
Included SMA Core Versions¶
Snowpark Conversion Core 7.1.13
Added¶
Added csv generator class for new inventory creation.
Added "full_name" column to import usages inventory.
Added transformation from pyspark.sql.functions.concat_ws to snowflake.snowpark.functions._concat_ws_ignore_nulls.
Added logic for generation of checkpoints.json.
Added the inventories:
DataFramesInventory.csv.
CheckpointsInventory.csv
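The concat_ws mapping above exists because Spark's concat_ws skips NULL inputs, while a naive separator join does not; the helper _concat_ws_ignore_nulls preserves that behavior. A plain-Python sketch of the two semantics (the function names below are illustrative, not the library's API):

```python
def concat_ws_ignore_nulls(sep, *values):
    # Spark-style: drop None values before joining
    return sep.join(str(v) for v in values if v is not None)

def naive_concat(sep, *values):
    # naive join: None values leak into the output
    return sep.join(str(v) for v in values)

print(concat_ws_ignore_nulls("-", "a", None, "b"))  # a-b
print(naive_concat("-", "a", None, "b"))            # a-None-b
```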
Version 2.6.0 (Feb 21, 2025)¶
Application & CLI Version 2.6.0¶
Desktop App¶
Updated the licensing agreement; acceptance is required.
Included SMA Core Versions¶
Snowpark Conversion Core 7.1.2
Added¶
Updated the mapping status for the following PySpark elements, from NotSupported to Direct:
pyspark.sql.types.ArrayType.json
pyspark.sql.types.ArrayType.jsonValue
pyspark.sql.types.ArrayType.simpleString
pyspark.sql.types.ArrayType.typeName
pyspark.sql.types.AtomicType.json
pyspark.sql.types.AtomicType.jsonValue
pyspark.sql.types.AtomicType.simpleString
pyspark.sql.types.AtomicType.typeName
pyspark.sql.types.BinaryType.json
pyspark.sql.types.BinaryType.jsonValue
pyspark.sql.types.BinaryType.simpleString
pyspark.sql.types.BinaryType.typeName
pyspark.sql.types.BooleanType.json
pyspark.sql.types.BooleanType.jsonValue
pyspark.sql.types.BooleanType.simpleString
pyspark.sql.types.BooleanType.typeName
pyspark.sql.types.ByteType.json
pyspark.sql.types.ByteType.jsonValue
pyspark.sql.types.ByteType.simpleString
pyspark.sql.types.ByteType.typeName
pyspark.sql.types.DecimalType.json
pyspark.sql.types.DecimalType.jsonValue
pyspark.sql.types.DecimalType.simpleString
pyspark.sql.types.DecimalType.typeName
pyspark.sql.types.DoubleType.json
pyspark.sql.types.DoubleType.jsonValue
pyspark.sql.types.DoubleType.simpleString
pyspark.sql.types.DoubleType.typeName
pyspark.sql.types.FloatType.json
pyspark.sql.types.FloatType.jsonValue
pyspark.sql.types.FloatType.simpleString
pyspark.sql.types.FloatType.typeName
pyspark.sql.types.FractionalType.json
pyspark.sql.types.FractionalType.jsonValue
pyspark.sql.types.FractionalType.simpleString
pyspark.sql.types.FractionalType.typeName
pyspark.sql.types.IntegerType.json
pyspark.sql.types.IntegerType.jsonValue
pyspark.sql.types.IntegerType.simpleString
pyspark.sql.types.IntegerType.typeName
pyspark.sql.types.IntegralType.json
pyspark.sql.types.IntegralType.jsonValue
pyspark.sql.types.IntegralType.simpleString
pyspark.sql.types.IntegralType.typeName
pyspark.sql.types.LongType.json
pyspark.sql.types.LongType.jsonValue
pyspark.sql.types.LongType.simpleString
pyspark.sql.types.LongType.typeName
pyspark.sql.types.MapType.json
pyspark.sql.types.MapType.jsonValue
pyspark.sql.types.MapType.simpleString
pyspark.sql.types.MapType.typeName
pyspark.sql.types.NullType.json
pyspark.sql.types.NullType.jsonValue
pyspark.sql.types.NullType.simpleString
pyspark.sql.types.NullType.typeName
pyspark.sql.types.NumericType.json
pyspark.sql.types.NumericType.jsonValue
pyspark.sql.types.NumericType.simpleString
pyspark.sql.types.NumericType.typeName
pyspark.sql.types.ShortType.json
pyspark.sql.types.ShortType.jsonValue
pyspark.sql.types.ShortType.simpleString
pyspark.sql.types.ShortType.typeName
pyspark.sql.types.StringType.json
pyspark.sql.types.StringType.jsonValue
pyspark.sql.types.StringType.simpleString
pyspark.sql.types.StringType.typeName
pyspark.sql.types.StructType.json
pyspark.sql.types.StructType.jsonValue
pyspark.sql.types.StructType.simpleString
pyspark.sql.types.StructType.typeName
pyspark.sql.types.TimestampType.json
pyspark.sql.types.TimestampType.jsonValue
pyspark.sql.types.TimestampType.simpleString
pyspark.sql.types.TimestampType.typeName
pyspark.sql.types.StructField.simpleString
pyspark.sql.types.StructField.typeName
pyspark.sql.types.StructField.json
pyspark.sql.types.StructField.jsonValue
pyspark.sql.types.DataType.json
pyspark.sql.types.DataType.jsonValue
pyspark.sql.types.DataType.simpleString
pyspark.sql.types.DataType.typeName
pyspark.sql.session.SparkSession.getActiveSession
pyspark.sql.session.SparkSession.version
pandas.io.html.read_html
pandas.io.json._normalize.json_normalize
pyspark.sql.types.ArrayType.fromJson
pyspark.sql.types.MapType.fromJson
pyspark.sql.types.StructField.fromJson
pyspark.sql.types.StructType.fromJson
pandas.core.groupby.generic.DataFrameGroupBy.pct_change
pandas.core.groupby.generic.SeriesGroupBy.pct_change
Updated the mapping status for the following Pandas elements, from NotSupported to Direct:
pandas.io.html.read_html
pandas.io.json._normalize.json_normalize
pandas.core.groupby.generic.DataFrameGroupBy.pct_change
pandas.core.groupby.generic.SeriesGroupBy.pct_change
Updated the mapping status for the following PySpark elements, from Rename to Direct:
pyspark.sql.functions.collect_list
pyspark.sql.functions.size
Fixed¶
Standardized the format of the version number in the inventories.
Version 2.5.2 (Feb 5, 2025)¶
Hotfix: Application & CLI Version 2.5.2¶
Desktop App¶
Fixed an issue that occurred when converting with the sample project option.
Included SMA Core Versions¶
Snowpark Conversion Core 5.3.0
Version 2.5.1 (Feb 4, 2025)¶
Application & CLI Version 2.5.1¶
Desktop App¶
Added a new mode for when the user does not have write permissions.
Updated the licensing agreement; users are required to accept it.
CLI¶
Fixed an issue with the year shown on the CLI screen when displaying '--version' or '-v'.
Included SMA Core Versions¶
Snowpark Conversion Core 5.3.0
Added¶
Added the following Python Third-Party libraries with Direct status:
about-time
affinegap
aiohappyeyeballs
alibi-detect
alive-progress
allure-nose2
allure-robotframework
anaconda-cloud-cli
anaconda-mirror
astropy-iers-data
asynch
asyncssh
autots
autoviml
aws-msk-iam-sasl-signer-python
azure-functions
backports.tarfile
blas
bottle
bson
cairo
capnproto
captum
categorical-distance
census
clickhouse-driver
clustergram
cma
conda-anaconda-telemetry
configspace
cpp-expected
dask-expr
data-science-utils
databricks-sdk
datetime-distance
db-dtypes
dedupe
dedupe-variable-datetime
dedupe_lehvenshtein_search
dedupe_levenshtein_search
diff-cover
diptest
dmglib
docstring_parser
doublemetaphone
dspy-ai
econml
emcee
emoji
environs
eth-abi
eth-hash
eth-typing
eth-utils
expat
filetype
fitter
flask-cors
fpdf2
frozendict
gcab
geojson
gettext
glib-tools
google-ads
google-ai-generativelanguage
google-api-python-client
google-auth-httplib2
google-cloud-bigquery
google-cloud-bigquery-core
google-cloud-bigquery-storage
google-cloud-bigquery-storage-core
google-cloud-resource-manager
google-generativeai
googlemaps
grapheme
graphene
graphql-relay
gravis
greykite
grpc-google-iam-v1
harfbuzz
hatch-fancy-pypi-readme
haversine
hiclass
hicolor-icon-theme
highered
hmmlearn
holidays-ext
httplib2
icu
imbalanced-ensemble
immutabledict
importlib-metadata
importlib-resources
inquirerpy
iterative-telemetry
jaraco.context
jaraco.test
jiter
jiwer
joserfc
jsoncpp
jsonpath
jsonpath-ng
jsonpath-python
kagglehub
keplergl
kt-legacy
langchain-community
langchain-experimental
langchain-snowflake
langchain-text-splitters
libabseil
libflac
libgfortran-ng
libgfortran5
libglib
libgomp
libgrpc
libgsf
libmagic
libogg
libopenblas
libpostal
libprotobuf
libsentencepiece
libsndfile
libstdcxx-ng
libtheora
libtiff
libvorbis
libwebp
lightweight-mmm
litestar
litestar-with-annotated-types
litestar-with-attrs
litestar-with-cryptography
litestar-with-jinja
litestar-with-jwt
litestar-with-prometheus
litestar-with-structlog
lunarcalendar-ext
matplotlib-venn
metricks
mimesis
modin-ray
momepy
mpg123
msgspec
msgspec-toml
msgspec-yaml
msitools
multipart
namex
nbconvert-all
nbconvert-core
nbconvert-pandoc
nlohmann_json
numba-cuda
numpyro
office365-rest-python-client
openapi-pydantic
opentelemetry-distro
opentelemetry-instrumentation
opentelemetry-instrumentation-system-metrics
optree
osmnx
pathlib
pdf2image
pfzy
pgpy
plumbum
pm4py
polars
polyfactory
poppler-cpp
postal
pre-commit
prompt-toolkit
propcache
py-partiql-parser
py_stringmatching
pyatlan
pyfakefs
pyfhel
pyhacrf-datamade
pyiceberg
pykrb5
pylbfgs
pymilvus
pymoo
pynisher
pyomo
pypdf
pypdf-with-crypto
pypdf-with-full
pypdf-with-image
pypng
pyprind
pyrfr
pysoundfile
pytest-codspeed
pytest-trio
python-barcode
python-box
python-docx
python-gssapi
python-iso639
python-magic
python-pandoc
python-zstd
pyuca
pyvinecopulib
pyxirr
qrcode
rai-sdk
ray-client
ray-observability
readline
rich-click
rouge-score
ruff
scikit-criteria
scikit-mobility
sentencepiece-python
sentencepiece-spm
setuptools-markdown
setuptools-scm
setuptools-scm-git-archive
shareplum
simdjson
simplecosine
sis-extras
slack-sdk
smac
snowflake-sqlalchemy
snowflake_legacy
socrata-py
spdlog
sphinxcontrib-images
sphinxcontrib-jquery
sphinxcontrib-youtube
splunk-opentelemetry
sqlfluff
squarify
st-theme
statistics
streamlit-antd-components
streamlit-condition-tree
streamlit-echarts
streamlit-feedback
streamlit-keplergl
streamlit-mermaid
streamlit-navigation-bar
streamlit-option-menu
strictyaml
stringdist
sybil
tensorflow-cpu
tensorflow-text
tiledb-ptorchaudio
torcheval
trio-websocket
trulens-connectors-snowflake
trulens-core
trulens-dashboard
trulens-feedback
trulens-otel-semconv
trulens-providers-cortex
tsdownsample
typing
typing-extensions
typing_extensions
unittest-xml-reporting
uritemplate
us
uuid6
wfdb
wsproto
zlib
zope.index
Added the following Python BuiltIn libraries with Direct status:
aifc
array
ast
asynchat
asyncio
asyncore
atexit
audioop
base64
bdb
binascii
bisect
builtins
bz2
calendar
cgi
cgitb
chunk
cmath
cmd
code
codecs
codeop
colorsys
compileall
concurrent
contextlib
contextvars
copy
copyreg
cProfile
crypt
csv
ctypes
curses
dbm
difflib
dis
distutils
doctest
email
ensurepip
enum
errno
faulthandler
fcntl
filecmp
fileinput
fnmatch
fractions
ftplib
functools
gc
getopt
getpass
gettext
graphlib
grp
gzip
hashlib
heapq
hmac
html
http
idlelib
imaplib
imghdr
imp
importlib
inspect
ipaddress
itertools
keyword
linecache
locale
lzma
mailbox
mailcap
marshal
math
mimetypes
mmap
modulefinder
msilib
multiprocessing
netrc
nis
nntplib
numbers
operator
optparse
ossaudiodev
pdb
pickle
pickletools
pipes
pkgutil
platform
plistlib
poplib
posix
pprint
profile
pstats
pty
pwd
py_compile
pyclbr
pydoc
queue
quopri
random
re
reprlib
resource
rlcompleter
runpy
sched
secrets
select
selectors
shelve
shlex
signal
site
sitecustomize
smtpd
smtplib
sndhdr
socket
socketserver
spwd
sqlite3
ssl
stat
string
stringprep
struct
subprocess
sunau
symtable
sysconfig
syslog
tabnanny
tarfile
telnetlib
tempfile
termios
test
textwrap
threading
timeit
tkinter
token
tokenize
tomllib
trace
traceback
tracemalloc
tty
turtle
turtledemo
types
unicodedata
urllib
uu
uuid
venv
warnings
wave
weakref
webbrowser
wsgiref
xdrlib
xml
xmlrpc
zipapp
zipfile
zipimport
zoneinfo
Added the following Python BuiltIn libraries with NotSupported status:
msvcrt
winreg
winsound
Changed¶
Updated the .NET version to v9.0.0.
Improved EWI SPRKPY1068.
Bumped the version of Snowpark Python API supported by the SMA from 1.24.0 to 1.25.0.
Updated the detailed report template; it now includes the Snowpark version applicable to Pandas.
Changed the following libraries from ThirdPartyLib to BuiltIn:
configparser
dataclasses
pathlib
readline
statistics
zlib
Updated the mapping status for the following Pandas elements, from Direct to Partial:
pandas.core.frame.DataFrame.add
pandas.core.frame.DataFrame.aggregate
pandas.core.frame.DataFrame.all
pandas.core.frame.DataFrame.apply
pandas.core.frame.DataFrame.astype
pandas.core.frame.DataFrame.cumsum
pandas.core.frame.DataFrame.div
pandas.core.frame.DataFrame.dropna
pandas.core.frame.DataFrame.eq
pandas.core.frame.DataFrame.ffill
pandas.core.frame.DataFrame.fillna
pandas.core.frame.DataFrame.floordiv
pandas.core.frame.DataFrame.ge
pandas.core.frame.DataFrame.groupby
pandas.core.frame.DataFrame.gt
pandas.core.frame.DataFrame.idxmax
pandas.core.frame.DataFrame.idxmin
pandas.core.frame.DataFrame.inf
pandas.core.frame.DataFrame.join
pandas.core.frame.DataFrame.le
pandas.core.frame.DataFrame.loc
pandas.core.frame.DataFrame.lt
pandas.core.frame.DataFrame.mask
pandas.core.frame.DataFrame.merge
pandas.core.frame.DataFrame.mod
pandas.core.frame.DataFrame.mul
pandas.core.frame.DataFrame.ne
pandas.core.frame.DataFrame.nunique
pandas.core.frame.DataFrame.pivot_table
pandas.core.frame.DataFrame.pow
pandas.core.frame.DataFrame.radd
pandas.core.frame.DataFrame.rank
pandas.core.frame.DataFrame.rdiv
pandas.core.frame.DataFrame.rename
pandas.core.frame.DataFrame.replace
pandas.core.frame.DataFrame.resample
pandas.core.frame.DataFrame.rfloordiv
pandas.core.frame.DataFrame.rmod
pandas.core.frame.DataFrame.rmul
pandas.core.frame.DataFrame.rolling
pandas.core.frame.DataFrame.round
pandas.core.frame.DataFrame.rpow
pandas.core.frame.DataFrame.rsub
pandas.core.frame.DataFrame.rtruediv
pandas.core.frame.DataFrame.shift
pandas.core.frame.DataFrame.skew
pandas.core.frame.DataFrame.sort_index
pandas.core.frame.DataFrame.sort_values
pandas.core.frame.DataFrame.sub
pandas.core.frame.DataFrame.to_dict
pandas.core.frame.DataFrame.transform
pandas.core.frame.DataFrame.transpose
pandas.core.frame.DataFrame.truediv
pandas.core.frame.DataFrame.var
pandas.core.indexes.datetimes.date_range
pandas.core.reshape.concat.concat
pandas.core.reshape.melt.melt
pandas.core.reshape.merge.merge
pandas.core.reshape.pivot.pivot_table
pandas.core.reshape.tile.cut
pandas.core.series.Series.add
pandas.core.series.Series.aggregate
pandas.core.series.Series.all
pandas.core.series.Series.any
pandas.core.series.Series.cumsum
pandas.core.series.Series.div
pandas.core.series.Series.dropna
pandas.core.series.Series.eq
pandas.core.series.Series.ffill
pandas.core.series.Series.fillna
pandas.core.series.Series.floordiv
pandas.core.series.Series.ge
pandas.core.series.Series.gt
pandas.core.series.Series.lt
pandas.core.series.Series.mask
pandas.core.series.Series.mod
pandas.core.series.Series.mul
pandas.core.series.Series.multiply
pandas.core.series.Series.ne
pandas.core.series.Series.pow
pandas.core.series.Series.quantile
pandas.core.series.Series.radd
pandas.core.series.Series.rank
pandas.core.series.Series.rdiv
pandas.core.series.Series.rename
pandas.core.series.Series.replace
pandas.core.series.Series.resample
pandas.core.series.Series.rfloordiv
pandas.core.series.Series.rmod
pandas.core.series.Series.rmul
pandas.core.series.Series.rolling
pandas.core.series.Series.rpow
pandas.core.series.Series.rsub
pandas.core.series.Series.rtruediv
pandas.core.series.Series.sample
pandas.core.series.Series.shift
pandas.core.series.Series.skew
pandas.core.series.Series.sort_index
pandas.core.series.Series.sort_values
pandas.core.series.Series.std
pandas.core.series.Series.sub
pandas.core.series.Series.subtract
pandas.core.series.Series.truediv
pandas.core.series.Series.value_counts
pandas.core.series.Series.var
pandas.core.series.Series.where
pandas.core.tools.numeric.to_numeric
Updated the mapping status for the following Pandas elements, from NotSupported to Direct:
pandas.core.frame.DataFrame.attrs
pandas.core.indexes.base.Index.to_numpy
pandas.core.series.Series.str.len
pandas.io.html.read_html
pandas.io.xml.read_xml
pandas.core.indexes.datetimes.DatetimeIndex.mean
pandas.core.resample.Resampler.indices
pandas.core.resample.Resampler.nunique
pandas.core.series.Series.items
pandas.core.tools.datetimes.to_datetime
pandas.io.sas.sasreader.read_sas
pandas.core.frame.DataFrame.attrs
pandas.core.frame.DataFrame.style
pandas.core.frame.DataFrame.items
pandas.core.groupby.generic.DataFrameGroupBy.head
pandas.core.groupby.generic.DataFrameGroupBy.median
pandas.core.groupby.generic.DataFrameGroupBy.min
pandas.core.groupby.generic.DataFrameGroupBy.nunique
pandas.core.groupby.generic.DataFrameGroupBy.tail
pandas.core.indexes.base.Index.is_boolean
pandas.core.indexes.base.Index.is_floating
pandas.core.indexes.base.Index.is_integer
pandas.core.indexes.base.Index.is_monotonic_decreasing
pandas.core.indexes.base.Index.is_monotonic_increasing
pandas.core.indexes.base.Index.is_numeric
pandas.core.indexes.base.Index.is_object
pandas.core.indexes.base.Index.max
pandas.core.indexes.base.Index.min
pandas.core.indexes.base.Index.name
pandas.core.indexes.base.Index.names
pandas.core.indexes.base.Index.rename
pandas.core.indexes.base.Index.set_names
pandas.core.indexes.datetimes.DatetimeIndex.day_name
pandas.core.indexes.datetimes.DatetimeIndex.month_name
pandas.core.indexes.datetimes.DatetimeIndex.time
pandas.core.indexes.timedeltas.TimedeltaIndex.ceil
pandas.core.indexes.timedeltas.TimedeltaIndex.days
pandas.core.indexes.timedeltas.TimedeltaIndex.floor
pandas.core.indexes.timedeltas.TimedeltaIndex.microseconds
pandas.core.indexes.timedeltas.TimedeltaIndex.nanoseconds
pandas.core.indexes.timedeltas.TimedeltaIndex.round
pandas.core.indexes.timedeltas.TimedeltaIndex.seconds
pandas.core.reshape.pivot.crosstab
pandas.core.series.Series.dt.round
pandas.core.series.Series.dt.time
pandas.core.series.Series.dt.weekday
pandas.core.series.Series.is_monotonic_decreasing
pandas.core.series.Series.is_monotonic_increasing
Updated the mapping status for the following Pandas elements, from NotSupported to Partial:
pandas.core.frame.DataFrame.align
pandas.core.series.Series.align
pandas.core.frame.DataFrame.tz_convert
pandas.core.frame.DataFrame.tz_localize
pandas.core.groupby.generic.DataFrameGroupBy.fillna
pandas.core.groupby.generic.SeriesGroupBy.fillna
pandas.core.indexes.datetimes.bdate_range
pandas.core.indexes.datetimes.DatetimeIndex.std
pandas.core.indexes.timedeltas.TimedeltaIndex.mean
pandas.core.resample.Resampler.asfreq
pandas.core.resample.Resampler.quantile
pandas.core.series.Series.map
pandas.core.series.Series.tz_convert
pandas.core.series.Series.tz_localize
pandas.core.window.expanding.Expanding.count
pandas.core.window.rolling.Rolling.count
pandas.core.groupby.generic.DataFrameGroupBy.aggregate
pandas.core.groupby.generic.SeriesGroupBy.aggregate
pandas.core.frame.DataFrame.applymap
pandas.core.series.Series.apply
pandas.core.groupby.generic.DataFrameGroupBy.bfill
pandas.core.groupby.generic.DataFrameGroupBy.ffill
pandas.core.groupby.generic.SeriesGroupBy.bfill
pandas.core.groupby.generic.SeriesGroupBy.ffill
pandas.core.frame.DataFrame.backfill
pandas.core.frame.DataFrame.bfill
pandas.core.frame.DataFrame.compare
pandas.core.frame.DataFrame.unstack
pandas.core.frame.DataFrame.asfreq
pandas.core.series.Series.backfill
pandas.core.series.Series.bfill
pandas.core.series.Series.compare
pandas.core.series.Series.unstack
pandas.core.series.Series.asfreq
pandas.core.series.Series.argmax
pandas.core.series.Series.argmin
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.microsecond
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.nanosecond
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.day_name
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.month_name
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.month_start
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.month_end
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.is_year_start
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.is_year_end
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.is_quarter_start
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.is_quarter_end
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.is_leap_year
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.floor
pandas.core.indexes.accessors.CombinedDatetimelikeProperties.ceil
pandas.core.groupby.generic.DataFrameGroupBy.idxmax
pandas.core.groupby.generic.DataFrameGroupBy.idxmin
pandas.core.groupby.generic.DataFrameGroupBy.std
pandas.core.tools.timedeltas.to_timedelta
Known Issues¶
This version contains an issue that prevents sample project conversion in this release; it will be fixed in the next version.
Version 2.4.3 (Jan 9, 2025)¶
Application & CLI Version 2.4.3¶
Desktop App¶
Added a troubleshooting guide link to the crash report modal.
Included SMA Core Versions¶
Snowpark Conversion Core 4.15.0
Added¶
Added the following PySpark elements to the ConversionStatusPySpark.csv file as NotSupported:
pyspark.sql.streaming.readwriter.DataStreamReader.table
pyspark.sql.streaming.readwriter.DataStreamReader.schema
pyspark.sql.streaming.readwriter.DataStreamReader.options
pyspark.sql.streaming.readwriter.DataStreamReader.option
pyspark.sql.streaming.readwriter.DataStreamReader.load
pyspark.sql.streaming.readwriter.DataStreamReader.format
pyspark.sql.streaming.query.StreamingQuery.awaitTermination
pyspark.sql.streaming.readwriter.DataStreamWriter.partitionBy
pyspark.sql.streaming.readwriter.DataStreamWriter.toTable
pyspark.sql.streaming.readwriter.DataStreamWriter.trigger
pyspark.sql.streaming.readwriter.DataStreamWriter.queryName
pyspark.sql.streaming.readwriter.DataStreamWriter.outputMode
pyspark.sql.streaming.readwriter.DataStreamWriter.format
pyspark.sql.streaming.readwriter.DataStreamWriter.option
pyspark.sql.streaming.readwriter.DataStreamWriter.foreachBatch
pyspark.sql.streaming.readwriter.DataStreamWriter.start
Changed¶
Updated the format of the Hive SQL EWIs:
SPRKHVSQL1001
SPRKHVSQL1002
SPRKHVSQL1003
SPRKHVSQL1004
SPRKHVSQL1005
SPRKHVSQL1006
Updated the format of the Spark SQL EWIs:
SPRKSPSQL1001
SPRKSPSQL1002
SPRKSPSQL1003
SPRKSPSQL1004
SPRKSPSQL1005
SPRKSPSQL1006
Fixed¶
Fixed a bug that prevented the tool from recognizing some PySpark elements.
Fixed a mismatch between the number of calls identified as ThirdParty and the number of ThirdParty import calls.
Version 2.4.2 (Dec 13, 2024)¶
Application & CLI Version 2.4.2¶
Included SMA Core Versions¶
Snowpark Conversion Core 4.14.0
Added¶
Added the following Spark elements to ConversionStatusPySpark.csv:
pyspark.broadcast.Broadcast.value
pyspark.conf.SparkConf.getAll
pyspark.conf.SparkConf.setAll
pyspark.conf.SparkConf.setMaster
pyspark.context.SparkContext.addFile
pyspark.context.SparkContext.addPyFile
pyspark.context.SparkContext.binaryFiles
pyspark.context.SparkContext.setSystemProperty
pyspark.context.SparkContext.version
pyspark.files.SparkFiles
pyspark.files.SparkFiles.get
pyspark.rdd.RDD.count
pyspark.rdd.RDD.distinct
pyspark.rdd.RDD.reduceByKey
pyspark.rdd.RDD.saveAsTextFile
pyspark.rdd.RDD.take
pyspark.rdd.RDD.zipWithIndex
pyspark.sql.context.SQLContext.udf
pyspark.sql.types.StructType.simpleString
Changed¶
Updated the documentation of the Pandas EWIs PNDSPY1001, PNDSPY1002, and PNDSPY1003, and of SPRKSCL1137, to align with the standardized format, ensuring consistency and clarity across all EWIs.
Updated the documentation of the following Scala EWIs: SPRKSCL1106 and SPRKSCL1107, aligning them with the standardized format to ensure consistency and clarity across all EWIs.
Fixed¶
Fixed a bug that caused UserDefined symbols to appear in the third-party usages inventory.
Version 2.4.1 (Dec 4, 2024)¶
Application & CLI Version 2.4.1¶
Included SMA Core Versions¶
Snowpark Conversion Core 4.13.1
Command Line Interface¶
Changed
Added a timestamp to the output folder.
Snowpark Conversion Core 4.13.1¶
Added¶
Added a "Source Language" column to the library mapping table.
Added Others as a new category in the Pandas API summary table of DetailedReport.docx.
Changed¶
Updated the documentation of the Python EWI SPRKPY1058.
Updated the message of the pandas EWI PNDSPY1002 to show the relevant Pandas element.
Updated how the .csv reports are created; they are now overwritten after a second run.
Fixed¶
Fixed a bug that prevented notebook files from being generated in the output.
Fixed the replacers for the get and set methods of pyspark.sql.conf.RuntimeConfig; the replacers now match the correct fully qualified name.
Fixed an incorrect query tag version.
Fixed an issue that caused UserDefined packages to be reported as ThirdPartyLib.
Version 2.3.1 (Nov 14, 2024)¶
Application & CLI Version 2.3.1¶
Included SMA Core Versions¶
Snowpark Conversion Core 4.12.0
Desktop App¶
Fixed
Fixed a case-sensitivity issue in the --sql option.
Removed
Removed the platform name from the show-ac message.
Snowpark Conversion Core 4.12.0¶
Added¶
Added support for Snowpark Python 1.23.0 and 1.24.0.
Added a new EWI for the pyspark.sql.dataframe.DataFrame.writeTo function. All usages of this function now receive the EWI SPRKPY1087.
Changed¶
Updated the documentation of the Scala EWIs from SPRKSCL1137 to SPRKSCL1156 to align with the standardized format, ensuring consistency and clarity across all EWIs.
Updated the documentation of the Scala EWIs from SPRKSCL1117 to SPRKSCL1136 to align with the standardized format, ensuring consistency and clarity across all EWIs.
Updated the messages shown for the following EWIs:
SPRKPY1082
SPRKPY1083
Updated the documentation of the Scala EWIs from SPRKSCL1100 to SPRKSCL1105, from SPRKSCL1108 to SPRKSCL1116, and from SPRKSCL1157 to SPRKSCL1175 to align with the standardized format, ensuring consistency and clarity across all EWIs.
Updated the mapping status of the following PySpark elements from NotSupported to Direct with EWI:
pyspark.sql.readwriter.DataFrameWriter.option => snowflake.snowpark.DataFrameWriter.option: all usages of this function now receive the EWI SPRKPY1088.
pyspark.sql.readwriter.DataFrameWriter.options => snowflake.snowpark.DataFrameWriter.options: all usages of this function now receive the EWI SPRKPY1089.
Updated the mapping status of the following PySpark element from Workaround to Rename:
pyspark.sql.readwriter.DataFrameWriter.partitionBy => snowflake.snowpark.DataFrameWriter.partition_by
Updated the EWI documentation: SPRKSCL1000, SPRKSCL1001, SPRKSCL1002, SPRKSCL1100, SPRKSCL1101, SPRKSCL1102, SPRKSCL1103, SPRKSCL1104, SPRKSCL1105.
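A Rename mapping like the partitionBy one above is a pure name change at the call site. The snippet below is a toy sketch, not the SMA's actual rewriting engine, showing the kind of textual rename such a mapping implies (the regex-based helper is hypothetical):

```python
import re

# Toy illustration (NOT the SMA engine): a Rename mapping rewrites the
# PySpark camelCase call into Snowpark's snake_case equivalent.
def rename_partition_by(src: str) -> str:
    return re.sub(r"\.partitionBy\(", ".partition_by(", src)

before = 'df.write.partitionBy("year").parquet(path)'
print(rename_partition_by(before))
# df.write.partition_by("year").parquet(path)
```

The real tool performs this on the parsed AST rather than on raw text, which is what lets it match the correct fully qualified name.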
Removed¶
Removed pyspark.sql.dataframe.DataFrameStatFunctions.writeTo from the conversion status, since this element no longer exists.
Deprecated¶
Deprecated the following EWI codes:
SPRKPY1081
SPRKPY1084
Version 2.3.0 (Oct 30, 2024)¶
Application & CLI Version 2.3.0¶
Snowpark Conversion Core 4.11.0
Snowpark Conversion Core 4.11.0¶
Added¶
Added a new column named Url to the Issues.csv file, which links to the corresponding EWI documentation.
Added new EWIs for the following Spark elements:
[SPRKPY1082] pyspark.sql.readwriter.DataFrameReader.load
[SPRKPY1083] pyspark.sql.readwriter.DataFrameWriter.save
[SPRKPY1084] pyspark.sql.readwriter.DataFrameWriter.option
[SPRKPY1085] pyspark.ml.feature.VectorAssembler
[SPRKPY1086] pyspark.ml.linalg.VectorUDT
Added 38 new Pandas elements:
pandas.core.frame.DataFrame.select
pandas.core.frame.DataFrame.str
pandas.core.frame.DataFrame.str.replace
pandas.core.frame.DataFrame.str.upper
pandas.core.frame.DataFrame.to_list
pandas.core.frame.DataFrame.tolist
pandas.core.frame.DataFrame.unique
pandas.core.frame.DataFrame.values.tolist
pandas.core.frame.DataFrame.withColumn
pandas.core.groupby.generic._SeriesGroupByScalar
pandas.core.groupby.generic._SeriesGroupByScalar[S1].agg
pandas.core.groupby.generic._SeriesGroupByScalar[S1].aggregate
pandas.core.indexes.datetimes.DatetimeIndex.year
pandas.core.series.Series.columns
pandas.core.tools.datetimes.to_datetime.date
pandas.core.tools.datetimes.to_datetime.dt.strftime
pandas.core.tools.datetimes.to_datetime.strftime
pandas.io.parsers.readers.TextFileReader.apply
pandas.io.parsers.readers.TextFileReader.astype
pandas.io.parsers.readers.TextFileReader.columns
pandas.io.parsers.readers.TextFileReader.copy
pandas.io.parsers.readers.TextFileReader.drop
pandas.io.parsers.readers.TextFileReader.drop_duplicates
pandas.io.parsers.readers.TextFileReader.fillna
pandas.io.parsers.readers.TextFileReader.groupby
pandas.io.parsers.readers.TextFileReader.head
pandas.io.parsers.readers.TextFileReader.iloc
pandas.io.parsers.readers.TextFileReader.isin
pandas.io.parsers.readers.TextFileReader.iterrows
pandas.io.parsers.readers.TextFileReader.loc
pandas.io.parsers.readers.TextFileReader.merge
pandas.io.parsers.readers.TextFileReader.rename
pandas.io.parsers.readers.TextFileReader.shape
pandas.io.parsers.readers.TextFileReader.to_csv
pandas.io.parsers.readers.TextFileReader.to_excel
pandas.io.parsers.readers.TextFileReader.unique
pandas.io.parsers.readers.TextFileReader.values
pandas.tseries.offsets
Version 2.2.3 (Oct 24, 2024)¶
Application Version 2.2.3¶
Included SMA Core Versions¶
Snowpark Conversion Core 4.10.0
Desktop App¶
Fixed¶
Fixed a bug that caused the SMA to display the label SnowConvert instead of Snowpark Migration Accelerator in the menu bar of the Windows version.
Fixed a bug that caused the SMA to crash when it lacked read/write permissions for the .config directory on macOS or the AppData directory on Windows.
Command Line Interface¶
Changed
Renamed the CLI executable from snowct to sma.
Removed the source-language argument, so you no longer need to specify whether you are running a Python or Scala assessment/conversion.
Expanded the command-line arguments supported by the CLI with the following new arguments:
--enableJupyter | -j: flag to indicate whether the conversion of Databricks notebooks to Jupyter is enabled.
--sql | -f: the database engine syntax to use when detecting SQL commands.
--customerEmail | -e: sets the customer email address.
--customerCompany | -c: sets the customer company.
--projectName | -p: sets the customer project.
Updated some texts to reflect the correct name of the application, ensuring consistency and clarity across all messages.
Updated the application's terms of use.
Updated and expanded the CLI documentation to reflect the latest features, enhancements, and changes.
Improved the text shown before proceeding with the execution of the SMA.
Updated the CLI to accept "yes" as a valid argument when prompting the user for confirmation.
Allowed the CLI to continue without waiting for user interaction when the -y or --yes argument is specified.
Updated the help message of the --sql argument to show the values this argument expects.
Snowpark Conversion Core Version 4.10.0¶
Added¶
Added a new EWI for the pyspark.sql.readwriter.DataFrameWriter.partitionBy function. All usages of this function now receive the EWI SPRKPY1081.
Added a new column named Technology to the ImportUsagesInventory.csv file.
Changed¶
Updated the Third-Party Libraries Readiness Score to also take Unknown libraries into account.
Updated the AssessmentFiles.zip file to include .json files instead of .pam files.
Improved the CSV-to-JSON conversion mechanism to make inventory processing faster.
Improved the documentation of the following EWIs:
SPRKPY1029
SPRKPY1054
SPRKPY1055
SPRKPY1063
SPRKPY1075
SPRKPY1076
Updated the mapping status of the following Spark Scala elements from Direct to Rename:
org.apache.spark.sql.functions.shiftLeft => com.snowflake.snowpark.functions.shiftleft
org.apache.spark.sql.functions.shiftRight => com.snowflake.snowpark.functions.shiftright
Updated the mapping status of the following Spark Scala elements from Not Supported to Direct:
org.apache.spark.sql.functions.shiftleft => com.snowflake.snowpark.functions.shiftleft
org.apache.spark.sql.functions.shiftright => com.snowflake.snowpark.functions.shiftright
Fixed¶
Fixed a bug that caused the SMA to populate the Origin column of the ImportUsagesInventory.csv file incorrectly.
Fixed a bug that caused the SMA not to classify the imported libraries io, json, logging, and unittest as Python built-in imports in the ImportUsagesInventory.csv file and the DetailedReport.docx file.
Version 2.2.2 (Oct 11, 2024)¶
Application Version 2.2.2¶
Feature updates include:
Snowpark Conversion Core 4.8.0
Snowpark Conversion Core Version 4.8.0¶
Added¶
Added the EwiCatalog.csv file and .md files to reorganize the documentation.
Added a Direct mapping status for pyspark.sql.functions.ln.
Added a transformation for pyspark.context.SparkContext.getOrCreate; see EWI SPRKPY1080 for more details.
Improved the SymbolTable to infer the types of parameters in functions.
Added SymbolTable support for static methods, which no longer assumes that the first parameter is self.
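The static-method case matters because Python static methods take no implicit instance argument, so type inference must not bind the first parameter to self. A minimal illustration (the class and names are made up):

```python
# Static methods receive no instance, so their first parameter is an
# ordinary parameter; instance methods do receive `self` first.
class Loader:
    @staticmethod
    def normalize(name):       # `name` is a plain parameter, not `self`
        return name.strip().lower()

    def load(self, name):      # instance method: first parameter IS `self`
        return "loading " + Loader.normalize(name)

print(Loader.normalize("  SALES  "))  # sales
```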
Added documentation for the following missing EWIs:
SPRKHVSQL1005
SPRKHVSQL1006
SPRKSPSQL1005
SPRKSPSQL1006
SPRKSCL1002
SPRKSCL1170
SPRKSCL1171
SPRKPY1057
SPRKPY1058
SPRKPY1059
SPRKPY1060
SPRKPY1061
SPRKPY1064
SPRKPY1065
SPRKPY1066
SPRKPY1067
SPRKPY1069
SPRKPY1070
SPRKPY1077
SPRKPY1078
SPRKPY1079
SPRKPY1101
Changed¶
Updated the mapping status of pyspark.sql.functions.array_remove from NotSupported to Direct.
Fixed¶
Fixed the "Code File Sizing" table in the Detailed Report to exclude .sql and .hql files, and added an "Extra Large" row to the table.
Fixed missing update_query_tag when SparkSession is defined across multiple lines in Python.
Fixed missing update_query_tag when SparkSession is defined across multiple lines in Scala.
Fixed missing EWI SPRKHVSQL1001 for some SQL statements with parsing errors.
Fixed preservation of newline values inside string literals.
Fixed the total lines of code shown in the "File Type Summary" table.
Fixed the "Parsing Score" showing as 0 when files were recognized successfully.
Fixed the LOC count in the inventory for Databricks Magic SQL cells.
Version 2.2.0 (Sep 26, 2024)¶
Application Version 2.2.0¶
Feature updates include:
Snowpark Conversion Core 4.6.0
Snowpark Conversion Core Version 4.6.0¶
Added¶
Added a transformation for pyspark.sql.readwriter.DataFrameReader.parquet.
Added a transformation for pyspark.sql.readwriter.DataFrameReader.option when it is chained from a Parquet method call.
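The "chained from a Parquet method call" condition refers to the usual fluent reader style, where each option call returns the reader itself. Sketched here with a tiny stand-in class (hypothetical, so the example runs without Spark):

```python
# Minimal stand-in for a fluent reader API: option() returns the reader,
# so a transformation has to walk the chain back from the parquet() call
# to find the options that apply to it.
class FakeReader:
    def __init__(self):
        self.opts = {}

    def option(self, key, value):
        self.opts[key] = value
        return self          # returning self enables chaining

    def parquet(self, path):
        return f"read {path} with options {self.opts}"

print(FakeReader().option("mergeSchema", "true").parquet("/data/t"))
```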
Changed¶
Updated the mapping status of the following elements:
pyspark.sql.types.StructType.fields from NotSupported to Direct.
pyspark.sql.types.StructType.names from NotSupported to Direct.
pyspark.context.SparkContext.setLogLevel from Workaround to Transformation. For more details, see EWIs SPRKPY1078 and SPRKPY1079.
org.apache.spark.sql.functions.round from Workaround to Direct.
org.apache.spark.sql.functions.udf from NotDefined to Transformation. For more details, see EWIs SPRKSCL1174 and SPRKSCL1175.
Updated the mapping status of the following Spark elements from DirectHelper to Direct:
org.apache.spark.sql.functions.unhex
org.apache.spark.sql.functions.shiftleft
org.apache.spark.sql.functions.shiftright
org.apache.spark.sql.functions.reverse
org.apache.spark.sql.functions.isnull
org.apache.spark.sql.functions.unix_timestamp
org.apache.spark.sql.functions.randn
org.apache.spark.sql.functions.signum
org.apache.spark.sql.functions.sign
org.apache.spark.sql.functions.collect_list
org.apache.spark.sql.functions.log10
org.apache.spark.sql.functions.log1p
org.apache.spark.sql.functions.base64
org.apache.spark.sql.functions.unbase64
org.apache.spark.sql.functions.regexp_extract
org.apache.spark.sql.functions.expr
org.apache.spark.sql.functions.date_format
org.apache.spark.sql.functions.desc
org.apache.spark.sql.functions.asc
org.apache.spark.sql.functions.size
org.apache.spark.sql.functions.locate
org.apache.spark.sql.functions.ntile
Fixed¶
Fixed the value shown for the Pandas API total percentage.
Fixed the total percentage of the ImportCalls table in the Detailed Report.
Deprecated¶
Deprecated the following EWI code:
SPRKSCL1115
Version 2.1.7 (Sep 12, 2024)¶
Application Version 2.1.7¶
Feature updates include:
Snowpark Conversion Core 4.5.7
Snowpark Conversion Core 4.5.2
Snowpark Conversion Core Version 4.5.7¶
Hotfixed¶
Fixed an issue that added a totals row to the "Spark Usages Summaries" when there is no usage data.
Upgraded the Python assembly to Version=1.3.111, which parses trailing commas in multi-line arguments.
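Trailing commas in multi-line argument lists are valid Python, so the parser has to accept code like the following minimal sketch:

```python
# A multi-line call whose last argument is followed by a trailing comma;
# this is legal Python and must parse cleanly.
result = max(
    10,
    25,
    7,   # trailing comma after the final argument
)
print(result)  # 25
```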
Snowpark Conversion Core Version 4.5.2¶
Added¶
Added a transformation for pyspark.sql.readwriter.DataFrameReader.option:
When chained from a CSV method call.
When chained from a JSON method call.
Added a transformation for pyspark.sql.readwriter.DataFrameReader.json.
Changed¶
Run the SMA on SQL strings passed to Python/Scala functions.
Create an AST in Scala/Python to emit temporary SQL units.
Create the SqlEmbeddedUsages.csv inventory.
Deprecate the SqlStatementsInventory.csv and SqlExtractionInventory.csv inventories.
Integrate an EWI when a SQL literal cannot be processed.
Create a new task to process SQL-embedded code.
Collect information for the SqlEmbeddedUsages.csv inventory in Python.
Replace the transformed SQL code with literals in Python.
Update test cases after their implementation.
Create tables and views for telemetry from the SqlEmbeddedUsages inventory.
Collect information for the SqlEmbeddedUsages.csv report in Scala.
Replace the transformed SQL code with literals in Scala.
Check the line-number order of the embedded SQL report.
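For context, "SQL-embedded code" means SQL carried inside Python or Scala string literals, often built with interpolation. A hypothetical example of the pattern these tasks target (the table name and query are made up):

```python
# A SQL string assembled with Python interpolation; the tool scans
# literals like this, records them in SqlEmbeddedUsages.csv, converts
# the SQL, and writes the converted text back as a literal.
table = "sales"
query = f"SELECT id, amount FROM {table} WHERE amount > 100"
print(query)  # SELECT id, amount FROM sales WHERE amount > 100
```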
Populated SqlFunctionsInfo.csv with the SQL functions documented for SparkSQL and HiveSQL.
Updated the mapping status of the following:
org.apache.spark.sql.SparkSession.sparkContext from NotSupported to Transformation.
org.apache.spark.sql.Builder.config from NotSupported to Transformation. With this new mapping status, the SMA removes all related calls to this function from the source code.
Version 2.1.6 (Sep 5, 2024)¶
Application Version 2.1.6¶
Hotfix change for Snowpark Engines Core Version 4.5.1
Spark Conversion Core Version 4.5.1¶
Hotfixed
Added a mechanism to convert the temporary Databricks notebooks generated by the SMA into exported Databricks notebooks.
Version 2.1.5 (Aug 29, 2024)¶
Application Version 2.1.5¶
Feature updates include:
Updated Spark Conversion Core to 4.3.2.
Spark Conversion Core Version 4.3.2¶
Added¶
Added a mechanism (via decoration) to get the line and column of elements identified in notebook cells.
Added an EWI for pyspark.sql.functions.from_json.
Added a transformation for pyspark.sql.readwriter.DataFrameReader.csv.
Enabled the query tag mechanism for Scala files.
Added the Code Analysis Score and additional links to the Detailed Report.
Added a column named OriginFilePath to InputFilesInventory.csv.
Changed¶
Updated the mapping status of pyspark.sql.functions.from_json from Not Supported to Transformation.
Updated the mapping status of the following Spark elements from Workaround to Direct:
org.apache.spark.sql.functions.countDistinct
org.apache.spark.sql.functions.max
org.apache.spark.sql.functions.min
org.apache.spark.sql.functions.mean
Deprecated¶
Deprecated the following EWI codes:
SPRKSCL1135
SPRKSCL1136
SPRKSCL1153
SPRKSCL1155
Fixed¶
Fixed a bug that caused the Spark API score to be calculated incorrectly.
Fixed a bug so that empty SQL files, and SQL files containing only comments, are no longer copied to the output folder.
Fixed a bug in the Detailed Report that caused inaccurate notebook LOC and cell-count statistics.
Version 2.1.2 (Aug 14, 2024)¶
Application Version 2.1.2¶
Feature updates include:
Updated Spark Conversion Core to 4.2.0.
Spark Conversion Core Version 4.2.0¶
Added¶
Added a Technology column to the SparkUsagesInventory.
Added an EWI for undefined SQL elements.
Added the SqlFunctions inventory.
Collected information for the SqlFunctions inventory.
Changed¶
The engine now processes and prints partially parsed Python files instead of leaving the original files unmodified.
Python notebook cells with parsing errors are also processed and printed.
Fixed¶
Fixed pandas.core.indexes.datetimes.DatetimeIndex.strftime being reported incorrectly.
Fixed a mismatch between the SQL Readiness Score and the "SQL Usages by Support Status".
Fixed a bug that caused the SMA to report an incorrect mapping status for pandas.core.series.Series.empty.
Fixed a mismatch between "Spark API Usages Ready for Conversion" in DetailedReport.docx and the UsagesReadyForConversion row in Assesment.json.
Version 2.1.1 (Aug 8, 2024)¶
Application Version 2.1.1¶
Feature updates include:
Updated Spark Conversion Core to 4.1.0.
Spark Conversion Core Version 4.1.0¶
Added¶
Added the following information to the AssessmentReport.json file:
The Third-Party Libraries Readiness Score.
The number of identified third-party library calls.
The number of third-party library calls supported in Snowpark.
The color codes associated with the Third-Party Readiness Score, the Spark API Readiness Score, and the SQL Readiness Score.
Added a transformation for SqlSimpleDataType in Spark CREATE TABLE statements.
Added a direct mapping for pyspark.sql.functions.get.
Added a direct mapping for pyspark.sql.functions.to_varchar.
As part of the post-unification changes, the tool now generates an execution info file in the engine.
Added a replacer for pyspark.sql.SparkSession.builder.appName.
Changed¶
Updated the mapping status of the following Spark elements from Not Supported to Direct:
pyspark.sql.functions.sign
pyspark.sql.functions.signum
Changed the notebook cells inventory report to indicate the kind of content of each cell in the Element column.
Added a SCALA_READINESS_SCORE column that reports the readiness score related only to references to the Spark API in Scala files.
Added partial support for transforming table properties in ALTER TABLE and ALTER VIEW.
Updated the conversion status of the SqlSimpleDataType node from Pending to Transformation in Spark CREATE TABLE statements.
Updated the Snowpark Scala API version supported by the SMA from 1.7.0 to 1.12.1, updating the mapping status of the following:
org.apache.spark.sql.SparkSession.getOrCreate from Rename to Direct
org.apache.spark.sql.functions.sum from Workaround to Direct
Updated the Snowpark Python API version supported by the SMA from 1.15.0 to 1.20.0, updating the mapping status of the following:
pyspark.sql.functions.arrays_zip from Not Supported to Direct
Updated the mapping status of the following Pandas elements:
Direct mappings:
pandas.core.frame.DataFrame.any
pandas.core.frame.DataFrame.applymap
Updated the mapping status of the following Pandas elements:
From Not Supported to Direct:
pandas.core.frame.DataFrame.groupby
pandas.core.frame.DataFrame.index
pandas.core.frame.DataFrame.T
pandas.core.frame.DataFrame.to_dict
From Not Supported to Rename:
pandas.core.frame.DataFrame.map
Updated the mapping status of the following Pandas elements:
Direct mappings:
pandas.core.frame.DataFrame.where
pandas.core.groupby.generic.SeriesGroupBy.agg
pandas.core.groupby.generic.SeriesGroupBy.aggregate
pandas.core.groupby.generic.DataFrameGroupBy.agg
pandas.core.groupby.generic.DataFrameGroupBy.aggregate
pandas.core.groupby.generic.DataFrameGroupBy.apply
Not Supported mappings:
pandas.core.frame.DataFrame.to_parquet
pandas.core.generic.NDFrame.to_csv
pandas.core.generic.NDFrame.to_excel
pandas.core.generic.NDFrame.to_sql
Updated the mapping status of the following Pandas elements:
Direct mappings:
pandas.core.series.Series.empty
pandas.core.series.Series.apply
pandas.core.reshape.tile.qcut
Direct mappings with EWI:
pandas.core.series.Series.fillna
pandas.core.series.Series.astype
pandas.core.reshape.melt.melt
pandas.core.reshape.tile.cut
pandas.core.reshape.pivot.pivot_table
Updated the mapping status of the following Pandas elements:
Direct mappings:
pandas.core.series.Series.dt
pandas.core.series.Series.groupby
pandas.core.series.Series.loc
pandas.core.series.Series.shape
pandas.core.tools.datetimes.to_datetime
pandas.io.excel._base.ExcelFile
Not Supported mappings:
pandas.core.series.Series.dt.strftime
Updated the mapping status of the following Pandas elements:
From Not Supported to Direct:
pandas.io.parquet.read_parquet
pandas.io.parsers.readers.read_csv
Updated the mapping status of the following Pandas elements:
From Not Supported to Direct:
pandas.io.pickle.read_pickle
pandas.io.sql.read_sql
pandas.io.sql.read_sql_query
Updated the description of "Understanding the SQL Readiness Score".
Updated the PyProgramCollector to collect packages and populate the current packages inventory with data from Python source code.
Updated the mapping status of pyspark.sql.SparkSession.builder.appName from Rename to Transformation.
Removed the following Scala integration tests:
AssesmentReportTest_AssessmentMode.ValidateReports_AssessmentMode
AssessmentReportTest_PythonAndScala_Files.ValidateReports_PythonAndScala
AssessmentReportTestWithoutSparkUsages.ValidateReports_WithoutSparkUsages
Updated the mapping status of pandas.core.generic.NDFrame.shape from Not Supported to Direct.
Updated the mapping status of pandas.core.series from Not Supported to Direct.
Deprecated¶
Deprecated the EWI code SPRKSCL1160, since org.apache.spark.sql.functions.sum is now a Direct mapping.
Fixed¶
Fixed a bug that prevented support for custom magics without arguments in Jupyter notebook cells.
Fixed incorrect generation of EWIs in the issues.csv report when parsing errors occur.
Fixed a bug that prevented the SMA from processing notebooks exported from Databricks as Databricks notebooks.
Fixed a stack overflow error when handling name collisions of types declared inside package objects.
Fixed the handling of complex lambda type names involving generics, for example, def func[X,Y](f:(Map[Option[X], Y] => Map[Y, X]))...
Fixed a bug that could cause the SMA to add a PySpark EWI code instead of a Pandas EWI code to Pandas elements that are not yet recognized.
Fixed a typo in the Detailed Report template, renaming a column from "Percentage of all Python Files" to "Percentage of all files".
Fixed a bug that reported pandas.core.series.Series.shape incorrectly.