snowflake.snowpark.DataFrame.pivot¶
- DataFrame.pivot(pivot_col: Union[Column, str], values: Optional[Union[Iterable[Union[None, bool, int, float, str, bytearray, Decimal, date, datetime, time, bytes, NaTType, float64, list, tuple, dict]], DataFrame]] = None, default_on_null: Union[None, bool, int, float, str, bytearray, Decimal, date, datetime, time, bytes, NaTType, float64, list, tuple, dict] = None) RelationalGroupedDataFrame [source] (https://github.com/snowflakedb/snowpark-python/blob/v1.16.0/src/snowflake/snowpark/dataframe.py#L1620-L1694)¶
Rotates this DataFrame by turning the unique values from one column in the input expression into multiple columns and aggregating results where required on any remaining column values.
Only one aggregate is supported with pivot.
Example:
>>> create_result = session.sql('''create or replace temp table monthly_sales(empid int, amount int, month text) ... as select * from values ... (1, 10000, 'JAN'), ... (1, 400, 'JAN'), ... (2, 4500, 'JAN'), ... (2, 35000, 'JAN'), ... (1, 5000, 'FEB'), ... (1, 3000, 'FEB'), ... (2, 200, 'FEB') ''').collect() >>> df = session.table("monthly_sales") >>> df.pivot("month", ['JAN', 'FEB']).sum("amount").sort(df["empid"]).show() ------------------------------- |"EMPID" |"'JAN'" |"'FEB'" | ------------------------------- |1 |10400 |8000 | |2 |39500 |200 | ------------------------------- >>> df = session.table("monthly_sales") >>> df.pivot("month").sum("amount").sort("empid").show() ------------------------------- |"EMPID" |"'FEB'" |"'JAN'" | ------------------------------- |1 |8000 |10400 | |2 |200 |39500 | ------------------------------- >>> subquery_df = session.table("monthly_sales").select(col("month")).filter(col("month") == "JAN") >>> df = session.table("monthly_sales") >>> df.pivot("month", values=subquery_df).sum("amount").sort("empid").show() --------------------- |"EMPID" |"'JAN'" | --------------------- |1 |10400 | |2 |39500 | ---------------------
- Parameters:
pivot_col – The column or name of the column to use.
values – A list of values in the column, or dynamic based on the DataFrame query, or None (default) will use all values of the pivot column.
default_on_null – Expression to replace empty result values.