snowflake.snowpark.DataFrameNaFunctions.drop
- DataFrameNaFunctions.drop(how: str = 'any', thresh: Optional[int] = None, subset: Optional[Union[str, Iterable[str]]] = None) → DataFrame [source] (https://github.com/snowflakedb/snowpark-python/blob/v1.23.0/src/snowflake/snowpark/dataframe_na_functions.py#L68-L216)
- Returns a new DataFrame that excludes all rows containing fewer than a specified number of non-null and non-NaN values in the specified columns.
- Parameters:
- how – A str with value either ‘any’ or ‘all’. If ‘any’, drop a row if it contains any null or NaN value. If ‘all’, drop a row only if all its values are null or NaN. The default value is ‘any’. If thresh is provided, how is ignored.
- thresh – The minimum number of non-null and non-NaN values that the specified columns must contain for a row to be kept. It overrides how. In each case:
  - If thresh is not provided or is None, the length of subset is used when how is ‘any’, and 1 is used when how is ‘all’.
  - If thresh is greater than the number of specified columns, the method returns an empty DataFrame.
  - If thresh is less than 1, the method returns the original DataFrame.
 
- subset – A column name or a list of column names to check for null and NaN values. In each case:
  - If subset is not provided or is None, all columns are checked.
  - If subset is empty, the method returns the original DataFrame.
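The rules for how, thresh, and subset described above can be sketched in plain Python. This is an illustrative model only, not the Snowpark implementation: it needs no Snowflake session, and the helper names `is_missing`, `effective_thresh`, and `drop_rows` are hypothetical. It also keeps column names in lowercase as given, whereas Snowpark displays them in uppercase.

```python
import math
from typing import Any, Iterable, List, Optional, Sequence, Tuple, Union

Row = Tuple[Any, ...]


def is_missing(value: Any) -> bool:
    """Treat None (standing in for SQL NULL) and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def effective_thresh(how: str, thresh: Optional[int], n_checked: int) -> int:
    """Resolve the minimum count of present values a row needs to be kept."""
    if thresh is not None:
        return thresh          # thresh overrides how
    if how == "any":
        return n_checked       # every checked column must be present
    if how == "all":
        return 1               # one present value is enough
    raise ValueError("how must be 'any' or 'all'")


def drop_rows(
    rows: Sequence[Row],
    columns: Sequence[str],
    how: str = "any",
    thresh: Optional[int] = None,
    subset: Optional[Union[str, Iterable[str]]] = None,
) -> List[Row]:
    """Hypothetical stand-in for the documented drop() semantics."""
    if subset is None:
        checked = list(columns)        # no subset: check all columns
    elif isinstance(subset, str):
        checked = [subset]             # a single column name
    else:
        checked = list(subset)
    if not checked:
        return list(rows)              # empty subset: original rows
    needed = effective_thresh(how, thresh, len(checked))
    if needed < 1:
        return list(rows)              # nothing can fail the check
    if needed > len(checked):
        return []                      # no row can satisfy the check
    idx = [columns.index(name) for name in checked]
    return [
        row for row in rows
        if sum(not is_missing(row[i]) for i in idx) >= needed
    ]


# Same data as the documented examples, as plain tuples.
columns = ["a", "b"]
rows = [(1.0, 1), (float("nan"), 2), (None, 3), (4.0, None), (float("nan"), None)]

print(drop_rows(rows, columns))                # how='any' over all columns
print(drop_rows(rows, columns, how="all"))     # same rows as thresh=1
print(drop_rows(rows, columns, subset="a"))    # check column "a" only
```

Expressing both how values as a threshold mirrors the parameter description: ‘any’ is equivalent to requiring every checked column to be present, and ‘all’ to requiring at least one.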
 
 
- Examples:

  >>> df = session.create_dataframe([[1.0, 1], [float('nan'), 2], [None, 3], [4.0, None], [float('nan'), None]]).to_df("a", "b")
  >>> # drop a row if it contains any null or NaN value, checking all columns
  >>> df.na.drop().show()
  -------------
  |"A"  |"B"  |
  -------------
  |1.0  |1    |
  -------------

  >>> # drop a row only if all its values are null or NaN, checking all columns
  >>> df.na.drop(how='all').show()
  ---------------
  |"A"   |"B"   |
  ---------------
  |1.0   |1     |
  |nan   |2     |
  |NULL  |3     |
  |4.0   |NULL  |
  ---------------

  >>> # keep a row only if it contains at least one non-null and non-NaN value, checking all columns
  >>> df.na.drop(thresh=1).show()
  ---------------
  |"A"   |"B"   |
  ---------------
  |1.0   |1     |
  |nan   |2     |
  |NULL  |3     |
  |4.0   |NULL  |
  ---------------

  >>> # drop a row if column "a" is null or NaN
  >>> df.na.drop(subset=["a"]).show()
  --------------
  |"A"  |"B"   |
  --------------
  |1.0  |1     |
  |4.0  |NULL  |
  --------------

  >>> # a single column name may also be passed as a string
  >>> df.na.drop(subset="a").show()
  --------------
  |"A"  |"B"   |
  --------------
  |1.0  |1     |
  |4.0  |NULL  |
  --------------

- See also