snowflake.snowpark.DataFrame.group_by_grouping_sets¶
- DataFrame.group_by_grouping_sets(*grouping_sets: Union[GroupingSets, Iterable[snowflake.snowpark.GroupingSets]]) RelationalGroupedDataFrame [source] (https://github.com/snowflakedb/snowpark-python/blob/v1.16.0/src/snowflake/snowpark/dataframe.py#L1509-L1550)¶
Performs a SQL GROUP BY GROUPING SETS. on the DataFrame.
GROUP BY GROUPING SETS is an extension of the GROUP BY clause that allows computing multiple GROUP BY clauses in a single statement. The group set is a set of dimension columns.
GROUP BY GROUPING SETS is equivalent to the UNION of two or more GROUP BY operations in the same result set.
Examples:
>>> from snowflake.snowpark import GroupingSets >>> df = session.create_dataframe([[1, 2, 10], [3, 4, 20], [1, 4, 30]], schema=["A", "B", "C"]) >>> df.group_by_grouping_sets(GroupingSets([col("a")])).count().collect() [Row(A=1, COUNT=2), Row(A=3, COUNT=1)] >>> df.group_by_grouping_sets(GroupingSets(col("a"))).count().collect() [Row(A=1, COUNT=2), Row(A=3, COUNT=1)] >>> df.group_by_grouping_sets(GroupingSets([col("a")], [col("b")])).count().collect() [Row(A=1, B=None, COUNT=2), Row(A=3, B=None, COUNT=1), Row(A=None, B=2, COUNT=1), Row(A=None, B=4, COUNT=2)] >>> df.group_by_grouping_sets(GroupingSets([col("a"), col("b")], [col("c")])).count().collect() [Row(A=None, B=None, C=10, COUNT=1), Row(A=None, B=None, C=20, COUNT=1), Row(A=None, B=None, C=30, COUNT=1), Row(A=1, B=2, C=None, COUNT=1), Row(A=3, B=4, C=None, COUNT=1), Row(A=1, B=4, C=None, COUNT=1)]
- Parameters:
grouping_sets – The list of
GroupingSets
to group by.