snowflake.snowpark.Column.within_group¶
- Column.within_group(*cols: Union[Column, str, Iterable[Union[Column, str]]]) Column [source] (https://github.com/snowflakedb/snowpark-python/blob/v1.16.0/src/snowflake/snowpark/column.py#L671-L729)¶
Returns a Column expression that adds a WITHIN GROUP clause to sort the rows by the specified columns.
This method is supported on Column expressions returned by some of the aggregate functions, including
functions.array_agg()
,functions.listagg()
, PERCENTILE_CONT(), and PERCENTILE_DISC(). For details, see the Snowflake documentation for the aggregate function that you are using (e.g. ARRAY_AGG).Examples:
>>> from snowflake.snowpark.functions import array_agg, col >>> from snowflake.snowpark import Window >>> df = session.create_dataframe([(3, "v1"), (1, "v3"), (2, "v2")], schema=["a", "b"]) >>> # create a DataFrame containing the values in "a" sorted by "b" >>> df.select(array_agg("a").within_group("b").alias("new_column")).show() ---------------- |"NEW_COLUMN" | ---------------- |[ | | 3, | | 2, | | 1 | |] | ---------------- >>> # create a DataFrame containing the values in "a" grouped by "b" >>> # and sorted by "a" in descending order. >>> df_array_agg_window = df.select(array_agg("a").within_group(col("a").desc()).over(Window.partitionBy(col("b"))).alias("new_column")) >>> df_array_agg_window.show() ---------------- |"NEW_COLUMN" | ---------------- |[ | | 3 | |] | |[ | | 1 | |] | |[ | | 2 | |] | ----------------