snowflake.snowpark.DataFrameNaFunctions.fill
- DataFrameNaFunctions.fill(value: Union[None, bool, int, float, str, bytearray, Decimal, date, datetime, time, bytes, NaTType, float64, list, tuple, dict, Dict[str, Union[None, bool, int, float, str, bytearray, Decimal, date, datetime, time, bytes, NaTType, float64, list, tuple, dict]]], subset: Optional[Union[str, Iterable[str]]] = None) -> DataFrame [source] (https://github.com/snowflakedb/snowpark-python/blob/v1.16.0/src/snowflake/snowpark/dataframe_na_functions.py#L218-L392)
Returns a new DataFrame that replaces all null and NaN values in the specified columns with the values provided.
- Parameters:
value – A scalar value or a dict that associates the names of columns with the values that should be used to replace null and NaN values in those columns. If value is a dict, subset is ignored. If value is an empty dict, the method returns the original DataFrame.
subset – A column name or a list of column names to check for null and NaN values:
    If subset is not provided or is None, all columns are included.
    If subset is empty, the method returns the original DataFrame.
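The parameter rules above can be sketched as a small pure-Python model. This is only an illustration of the documented behavior, not Snowpark's implementation: the helper name `resolve_fill_targets` and its `all_columns` argument are hypothetical, and an empty result stands for "return the original DataFrame".

```python
from typing import Any, Dict, Iterable, List, Optional, Union


def resolve_fill_targets(
    value: Any,
    subset: Optional[Union[str, Iterable[str]]],
    all_columns: List[str],
) -> Dict[str, Any]:
    """Hypothetical helper: map each targeted column to its replacement value.

    An empty mapping means there is nothing to replace, which corresponds to
    the documented case where the original DataFrame is returned.
    """
    if isinstance(value, dict):
        # A dict names its own columns, so `subset` is ignored.
        return dict(value)
    if subset is None:
        # No subset given: the scalar applies to every column.
        return {name: value for name in all_columns}
    if isinstance(subset, str):
        # A single column name is treated as a one-element list.
        subset = [subset]
    return {name: value for name in subset}


print(resolve_fill_targets(3.14, None, ["a", "b"]))
print(resolve_fill_targets(3.14, "a", ["a", "b"]))
print(resolve_fill_targets({"a": 3.14}, ["b"], ["a", "b"]))
print(resolve_fill_targets(3.14, [], ["a", "b"]))
```

The third call shows the point that is easiest to miss: when value is a dict, the subset argument has no effect on which columns are targeted.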
Examples:
>>> df = session.create_dataframe([[1.0, 1], [float('nan'), 2], [None, 3], [4.0, None], [float('nan'), None]]).to_df("a", "b")
>>> # fill null and NaN values in all columns
>>> df.na.fill(3.14).show()
---------------
|"A"   |"B"   |
---------------
|1.0   |1     |
|3.14  |2     |
|3.14  |3     |
|4.0   |NULL  |
|3.14  |NULL  |
---------------

>>> # fill null and NaN values in column "a"
>>> df.na.fill(3.14, subset="a").show()
---------------
|"A"   |"B"   |
---------------
|1.0   |1     |
|3.14  |2     |
|3.14  |3     |
|4.0   |NULL  |
|3.14  |NULL  |
---------------

>>> # fill null and NaN values in column "a"
>>> df.na.fill({"a": 3.14}).show()
---------------
|"A"   |"B"   |
---------------
|1.0   |1     |
|3.14  |2     |
|3.14  |3     |
|4.0   |NULL  |
|3.14  |NULL  |
---------------

>>> # fill null and NaN values in column "a" and "b"
>>> df.na.fill({"a": 3.14, "b": 15}).show()
--------------
|"A"   |"B"  |
--------------
|1.0   |1    |
|3.14  |2    |
|3.14  |3    |
|4.0   |15   |
|3.14  |15   |
--------------

>>> df2 = session.create_dataframe([[1.0, True], [2.0, False], [3.0, False], [None, None]]).to_df("a", "b")
>>> df2.na.fill(True).show()
----------------
|"A"   |"B"    |
----------------
|1.0   |True   |
|2.0   |False  |
|3.0   |False  |
|NULL  |True   |
----------------
Note
If the type of a given value in value doesn’t match the column data type (e.g. a float for a StringType column), the replacement is skipped for that column. In particular, an int can fill a FloatType or DoubleType column, but a float cannot fill an IntegerType or LongType column.
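The type-matching rule in this note can be approximated with a short pure-Python check. This is a simplified model under stated assumptions, not the library's actual logic: the function `can_fill` is hypothetical, column types are represented by their names as plain strings, and only the types mentioned in the note and examples are covered. Treating bool as matching only a BooleanType column is an assumption inferred from the `df2` example, where `fill(True)` leaves the floating-point column untouched.

```python
def can_fill(value: object, column_type: str) -> bool:
    """Hypothetical check: would `value` be accepted for a column of this type?"""
    # bool is tested first because Python's bool is a subclass of int.
    if isinstance(value, bool):
        return column_type == "BooleanType"  # assumption drawn from the df2 example
    if isinstance(value, int):
        # An int may fill integer columns and also floating-point columns.
        return column_type in {"IntegerType", "LongType", "FloatType", "DoubleType"}
    if isinstance(value, float):
        # A float may fill floating-point columns only.
        return column_type in {"FloatType", "DoubleType"}
    if isinstance(value, str):
        return column_type == "StringType"
    # Other value types are outside the scope of this sketch.
    return False


print(can_fill(15, "DoubleType"))
print(can_fill(3.14, "LongType"))
print(can_fill(3.14, "StringType"))
```

Under this model, a column for which the check fails is simply left unchanged, which matches the NULL values that remain in column "B" after `df.na.fill(3.14)` in the examples.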
See also