snowflake.snowpark.modin.plugin.extensions.groupby_overrides.DataFrameGroupBy.aggregate¶
- DataFrameGroupBy.aggregate(func: Optional[Union[Callable, str, list[Union[Callable, str]], MutableMapping[Hashable, Union[Callable, str, list[Union[Callable, str]]]]]] = None, *args: Any, engine: Optional[Literal['cython', 'numba']] = None, engine_kwargs: Optional[dict[str, bool]] = None, **kwargs: Any)[source] (https://github.com/snowflakedb/snowpark-python/blob/v1.23.0/src/snowflake/snowpark/modin/plugin/extensions/groupby_overrides.py#L255-L332)¶
- Aggregate using one or more operations over the specified axis.
- Parameters:
- func (function, str, list, or dict) – Function to use for aggregating the data. If a function, it must either work when passed a DataFrame or when passed to DataFrame.apply. Accepted combinations are:
  - function
  - string function name
  - list of functions and/or function names, e.g. [np.sum, 'mean']
  - dict of axis labels -> functions, function names, or a list of such
- *args – Positional arguments to pass to func. 
- engine (str, default None) –
  - 'cython': Runs the function through C-extensions from cython.
  - 'numba': Runs the function through JIT compiled code from numba.
  - None: Defaults to 'cython' or the global setting compute.use_numba.

  This parameter is ignored in Snowpark pandas. The execution engine will always be Snowflake.
- engine_kwargs (dict, default None) –
  - For the 'cython' engine, there are no accepted engine_kwargs.
  - For the 'numba' engine, the accepted dictionary keys are nopython, nogil, and parallel. The values must be either True or False. The default engine_kwargs for the 'numba' engine is {'nopython': True, 'nogil': False, 'parallel': False} and will be applied to the function.

  This parameter is ignored in Snowpark pandas. The execution engine will always be Snowflake.
- **kwargs – Keyword arguments to be passed into func.
 
- Return type:
  DataFrame
- Examples

>>> df = pd.DataFrame(
...     {
...         "A": [1, 1, 2, 2],
...         "B": [1, 2, 3, 4],
...         "C": [0.362838, 0.227877, 1.267767, -0.562860],
...     }
... )

>>> df
   A  B         C
0  1  1  0.362838
1  1  2  0.227877
2  2  3  1.267767
3  2  4 -0.562860

Apply a single aggregation to all columns:

>>> df.groupby('A').agg('min')
   B         C
A
1  1  0.227877
2  3 -0.562860

Apply multiple aggregations to all columns:

>>> df.groupby('A').agg(['min', 'max'])
    B             C
  min max       min       max
A
1   1   2  0.227877  0.362838
2   3   4 -0.562860  1.267767

Select a single column and apply aggregations:

>>> df.groupby('A').B.agg(['min', 'max'])
   min  max
A
1    1    2
2    3    4

Apply different aggregations to specific columns:

>>> df.groupby('A').agg({'B': ['min', 'max'], 'C': 'sum'})
    B             C
  min max       sum
A
1   1   2  0.590715
2   3   4  0.704907
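The examples above pass func as string names; as the Parameters section notes, func may also be a plain callable, with extra **kwargs forwarded to it. A minimal sketch in plain pandas (the call shape is the same in Snowpark pandas, which mirrors the pandas API; `spread` is a hypothetical helper, not part of either library):

```python
import pandas as pd

# Hypothetical aggregation helper: per-group range (max - min),
# optionally scaled. Not part of the pandas or Snowpark pandas API.
def spread(s, scale=1):
    return (s.max() - s.min()) * scale

df = pd.DataFrame({"A": [1, 1, 2, 2], "B": [1, 2, 3, 4]})

# scale=10 is passed through agg() into spread for each group.
# Both groups have a B-range of 1, so each scaled spread is 10.
result = df.groupby("A").agg(spread, scale=10)
```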