modin.pandas.crosstab
- modin.pandas.crosstab(index, columns, values=None, rownames=None, colnames=None, aggfunc=None, margins=False, margins_name: str = 'All', dropna: bool = True, normalize=False) → DataFrame [source] (https://github.com/snowflakedb/snowpark-python/blob/v1.30.0/snowpark-python/src/snowflake/snowpark/modin/plugin/extensions/general_overrides.py#L171-L439)
- Compute a simple cross tabulation of two (or more) factors.
- By default, computes a frequency table of the factors unless an array of values and an aggregation function are passed.
- Parameters:
- index (array-like, Series, or list of arrays/Series) – Values to group by in the rows. 
- columns (array-like, Series, or list of arrays/Series) – Values to group by in the columns. 
- values (array-like, optional) – Array of values to aggregate according to the factors. Requires aggfunc be specified. 
- rownames (sequence, default None) – If passed, must match number of row arrays passed. 
- colnames (sequence, default None) – If passed, must match number of column arrays passed. 
- aggfunc (function, optional) – If specified, requires values be specified as well. 
- margins (bool, default False) – Add row/column margins (subtotals). 
- margins_name (str, default 'All') – Name of the row/column that will contain the totals when margins is True. 
- dropna (bool, default True) – Do not include columns whose entries are all NaN. 
- normalize (bool, {'all', 'index', 'columns'}, or {0,1}, default False) – Normalize by dividing all values by the sum of values.
- If passed ‘all’ or True, will normalize over all values. 
- If passed ‘index’ will normalize over each row. 
- If passed ‘columns’ will normalize over each column. 
- If margins is True, will also normalize margin values. 
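The arithmetic behind the normalize options can be sketched in plain Python. This is only an illustration of the division described above, not the Snowpark pandas implementation; the helper `normalize_counts` and the nested-dict table layout are hypothetical, and the sketch ignores the {0,1} aliases and margins.

```python
def normalize_counts(table, mode):
    """Normalize a nested dict {row_label: {col_label: count}}.

    Hypothetical helper for illustration only; it mirrors the documented
    meaning of normalize='all' / True, 'index', and 'columns'.
    """
    rows = list(table)
    cols = list(next(iter(table.values())))
    if mode is True or mode == "all":
        # Divide every cell by the grand total.
        total = sum(table[r][c] for r in rows for c in cols)
        return {r: {c: table[r][c] / total for c in cols} for r in rows}
    if mode == "index":
        # Divide every cell by the sum of its row.
        return {r: {c: table[r][c] / sum(table[r].values()) for c in cols}
                for r in rows}
    if mode == "columns":
        # Divide every cell by the sum of its column.
        col_totals = {c: sum(table[r][c] for r in rows) for c in cols}
        return {r: {c: table[r][c] / col_totals[c] for c in cols}
                for r in rows}
    raise ValueError(f"unsupported mode in this sketch: {mode!r}")


# Hypothetical counts used only to exercise the helper.
counts = {"bar": {"one": 3, "two": 1}, "foo": {"one": 4, "two": 3}}

by_row = normalize_counts(counts, "index")      # each row sums to 1
by_col = normalize_counts(counts, "columns")    # each column sums to 1
overall = normalize_counts(counts, "all")       # all cells sum to 1
```

With these counts, normalizing by 'index' turns the bar row into 3/4 and 1/4, while normalizing by 'columns' turns the two column into 1/4 and 3/4.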
 
 
- Returns:
- Cross tabulation of the data. 
- Return type:
- Snowpark pandas DataFrame
- Notes
- Raises NotImplementedError if aggfunc is not one of “count”, “mean”, “min”, “max”, or “sum”, or if all of the following hold: margins is True, normalize is True or ‘all’, and values is passed.
- Examples
>>> a = np.array(["foo", "foo", "foo", "foo", "bar", "bar",
...               "bar", "bar", "foo", "foo", "foo"], dtype=object)
>>> b = np.array(["one", "one", "one", "two", "one", "one",
...               "one", "two", "two", "two", "one"], dtype=object)
>>> c = np.array(["dull", "dull", "shiny", "dull", "dull", "shiny",
...               "shiny", "dull", "shiny", "shiny", "shiny"],
...              dtype=object)
>>> pd.crosstab(a, [b, c], rownames=['a'], colnames=['b', 'c'])
b    one        two
c   dull shiny dull shiny
a
bar    1     2    1     0
foo    2     2    1     2
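To make the default frequency-table behavior concrete, the counts in the example can be reproduced with the standard library alone. This is a minimal sketch of what a cross tabulation counts, assuming the same three input sequences; `frequency_table` is a hypothetical helper, not part of Snowpark pandas, and it returns nested dicts rather than a DataFrame.

```python
from collections import Counter

a = ["foo", "foo", "foo", "foo", "bar", "bar",
     "bar", "bar", "foo", "foo", "foo"]
b = ["one", "one", "one", "two", "one", "one",
     "one", "two", "two", "two", "one"]
c = ["dull", "dull", "shiny", "dull", "dull", "shiny",
     "shiny", "dull", "shiny", "shiny", "shiny"]


def frequency_table(index, columns):
    """Count how often each (row key, column key) pair occurs.

    Hypothetical helper for illustration: row and column labels are the
    sorted distinct keys that were observed, and unobserved pairs get 0.
    """
    counts = Counter(zip(index, columns))
    row_labels = sorted(set(index))
    col_labels = sorted(set(columns))
    return {r: {col: counts[(r, col)] for col in col_labels}
            for r in row_labels}


# Two column factors are combined into one tuple key per observation,
# which plays the role of the two-level column header in the example.
table = frequency_table(a, list(zip(b, c)))

for row_label, row in table.items():
    print(row_label, [row[col] for col in sorted(row)])
```

The cell for bar under (two, shiny) is 0 because that combination never occurs for bar, although the column exists since it does occur for foo.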