snowflake.snowpark.modin.plugin.extensions.groupby_overrides.DataFrameGroupBy.sum
- DataFrameGroupBy.sum(numeric_only: bool = False, min_count: int = 0, engine: Optional[Literal['cython', 'numba']] = None, engine_kwargs: Optional[dict[str, bool]] = None)[source] (https://github.com/snowflakedb/snowpark-python/blob/v1.26.0/snowpark-python/src/snowflake/snowpark/modin/plugin/extensions/groupby_overrides.py#L927-L944)
Compute sum of group values.
- Parameters:
numeric_only (bool, default False) – Include only float, int, boolean columns.
min_count (int, default 0) – The required number of valid values to perform the operation. If fewer than min_count non-NA values are present, the result will be NA.
engine (str, default None) –
- 'cython': Runs rolling apply through C-extensions from cython.
- 'numba': Runs rolling apply through JIT compiled code from numba. Only available when raw is set to True.
- None: Defaults to 'cython' or globally setting compute.use_numba
This parameter is ignored in Snowpark pandas. The execution engine will always be Snowflake.
engine_kwargs (dict, default None) –
- For 'cython' engine, there are no accepted engine_kwargs
- For 'numba' engine, the engine can accept nopython, nogil and parallel dictionary keys. The values must either be True or False. The default engine_kwargs for the 'numba' engine is {'nopython': True, 'nogil': False, 'parallel': False} and will be applied to both the func and the apply groupby aggregation.
This parameter is ignored in Snowpark pandas. The execution engine will always be Snowflake.
- Returns:
Computed sum of values within each group.
- Return type:
Examples
For SeriesGroupBy:
>>> lst = ['a', 'a', 'b', 'b']
>>> ser = pd.Series([1, 2, 3, 4], index=lst)
>>> ser
a    1
a    2
b    3
b    4
dtype: int64
>>> ser.groupby(level=0).sum()
a    3
b    7
dtype: int64
For DataFrameGroupBy:
>>> data = [[1, 8, 2], [1, 2, 5], [2, 5, 8], [2, 6, 9]]
>>> df = pd.DataFrame(data, columns=["a", "b", "c"],
...                   index=["tiger", "leopard", "cheetah", "lion"])
>>> df
         a  b  c
tiger    1  8  2
leopard  1  2  5
cheetah  2  5  8
lion     2  6  9
>>> df.groupby("a").sum()
    b   c
a
1  10   7
2  11  17
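The numeric_only and min_count parameters described above can be illustrated with a small sketch. It runs against plain pandas as a stand-in so that it needs no Snowflake session. The assumption is that Snowpark pandas returns the same values for these arguments, which this page does not confirm. The frame contents and the variable names default_sum and strict_sum are made up for illustration.

```python
import pandas as pd  # plain pandas stand-in (assumption: Snowpark pandas behaves the same)

# Hypothetical data: group 2 has no non-NA values in column "b",
# and "name" is a non-numeric column.
df = pd.DataFrame({
    "a": [1, 1, 2, 2],
    "b": [8.0, None, None, None],
    "c": [2, 5, 8, 9],
    "name": ["tiger", "leopard", "cheetah", "lion"],
})

# numeric_only=True keeps only the numeric columns "b" and "c".
# With the default min_count=0, an all-NA group sums to 0.
default_sum = df.groupby("a").sum(numeric_only=True)

# min_count=1 requires at least one non-NA value per group,
# so the all-NA group in "b" yields NA instead of 0.
strict_sum = df.groupby("a").sum(numeric_only=True, min_count=1)

print(default_sum)
print(strict_sum)
```

Comparing the two results, only the entry for group 2 in column "b" differs: it is 0 under the default and NA once min_count=1 is required.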