modin.pandas.Series.map

Series.map(arg, na_action=None) → Series [source] (https://github.com/snowflakedb/snowpark-python/blob/v1.41.0/.tox/docs/lib/python3.9/site-packages/modin/pandas/series.py#L1451-L1475)

Map values of Series according to an input mapping or function.

Used for substituting each value in a Series with another value that may be derived from a function, a dict or a Series.

Parameters:
  • arg (function, collections.abc.Mapping subclass or Series) – Mapping correspondence. Only function is currently supported by Snowpark pandas.

  • na_action ({None, 'ignore'}, default None) – If ‘ignore’, propagate NULL values without passing them to the mapping correspondence. Note that it will not bypass NaN values in a FLOAT column in Snowflake. ‘ignore’ is currently not supported by Snowpark pandas.

Returns:

Same index as caller.

Return type:

Series

See also

Series.apply : For applying more complex functions on a Series.

DataFrame.apply : Apply a function row-/column-wise.

DataFrame.applymap : Apply a function elementwise on a whole DataFrame.

Notes

When arg is a dictionary, values in Series that are not in the dictionary (as keys) are converted to NaN. However, if the dictionary is a dict subclass that defines __missing__ (i.e. provides a method for default values), then this default is used rather than NaN.
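The lookup rule in the note above can be sketched in plain Python. This is only an illustration of the documented result, not how Snowpark pandas evaluates map; the helper name map_values and the sample data are hypothetical.

```python
import math
from collections import defaultdict


def map_values(values, mapping):
    """Hypothetical stand-in that mirrors the documented dict lookup rule."""
    if hasattr(mapping, "__missing__"):
        # Indexing triggers __missing__, so absent keys receive the default.
        return [mapping[value] for value in values]
    # A plain dict defines no default, so absent keys become NaN.
    return [mapping.get(value, math.nan) for value in values]


animals = ["cat", "dog", "rabbit"]
plain = {"cat": "kitten", "dog": "puppy"}
with_default = defaultdict(lambda: "unknown", plain)

plain_result = map_values(animals, plain)
default_result = map_values(animals, with_default)
print(plain_result)
print(default_result)
```

With the plain dict, 'rabbit' has no key and comes back as NaN; with the defaultdict, the same lookup returns the default value instead.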

Examples

>>> s = pd.Series(['cat', 'dog', None, 'rabbit'])
>>> s
0       cat
1       dog
2      None
3    rabbit
dtype: object

map accepts a dict or a Series. Values that are not found in the dict are converted to NaN, unless the dict has a default value (e.g. defaultdict):

>>> s.map({'cat': 'kitten', 'dog': 'puppy'})
0    kitten
1     puppy
2      None
3      None
dtype: object

It also accepts a function:

>>> s.map('I am a {}'.format)
0       I am a cat
1       I am a dog
2      I am a <NA>
3    I am a rabbit
dtype: object

To avoid applying the function to missing values (and keep them as NaN), use na_action='ignore' (currently not supported by Snowpark pandas):

>>> s.map('I am a {}'.format, na_action='ignore')  
0       I am a cat
1       I am a dog
2             None
3    I am a rabbit
dtype: object

Note that in the above example, the missing value in Snowflake is NULL; it is mapped to None in a string/object column.
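As a rough illustration of the na_action semantics described above, the following plain-Python sketch skips missing values when 'ignore' is requested. The helper map_with_na_action is hypothetical and treats only None as missing; it does not model Snowflake NULL or NaN handling, and it is not the Snowpark pandas implementation.

```python
def map_with_na_action(values, func, na_action=None):
    """Hypothetical helper: apply func elementwise, optionally skipping None."""
    if na_action not in (None, "ignore"):
        raise ValueError("na_action must be None or 'ignore'")
    if na_action == "ignore":
        # Missing values are propagated without being passed to func.
        return [value if value is None else func(value) for value in values]
    # Default: every element, including None, is passed to func.
    return [func(value) for value in values]


animals = ["cat", "dog", None, "rabbit"]
ignored = map_with_na_action(animals, "I am a {}".format, na_action="ignore")
applied = map_with_na_action(animals, "I am a {}".format)
print(ignored)
print(applied)
```

In plain Python the default call formats the missing entry as 'I am a None', whereas the Series output earlier on this page shows <NA> for the same position.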

Snowpark pandas does not yet support dict subclasses other than collections.defaultdict that define a __missing__ method.

To generate a permanent UDF, pass a dictionary as the snowflake_udf_params keyword argument. The following example generates a permanent UDF named “permanent_double”:

>>> session.sql("CREATE STAGE sample_upload_stage").collect()  
>>> def double(x: str) -> str:  
...     return x * 2  
...
>>> s.map(double, snowflake_udf_params={"name": "permanent_double", "stage_location": "@sample_upload_stage"})  
0          catcat
1          dogdog
2            None
3    rabbitrabbit
dtype: object

You may also pass “replace” and “if_not_exists” in the dictionary to overwrite or re-use existing UDTFs.

With the “replace” flag:

>>> df.apply(double, snowflake_udf_params={  
...     "name": "permanent_double",
...     "stage_location": "@sample_upload_stage",
...     "replace": True,
... })

With the “if_not_exists” flag:

>>> df.apply(double, snowflake_udf_params={  
...     "name": "permanent_double",
...     "stage_location": "@sample_upload_stage",
...     "if_not_exists": True,
... })

Note that Snowpark pandas may still attempt to upload a new UDTF even when “if_not_exists” is passed; the generated SQL will just contain a CREATE FUNCTION IF NOT EXISTS query instead. Subsequent calls to apply within the same session may skip this query.

Passing the immutable keyword creates an immutable UDTF, which assumes that the UDTF will return the same result for the same inputs.

>>> df.apply(double, snowflake_udf_params={  
...     "name": "permanent_double",
...     "stage_location": "@sample_upload_stage",
...     "replace": True,
...     "immutable": True,
... })