snowflake.snowpark.DataFrame.crosstab
- DataFrame.crosstab(col1: Union[Column, str], col2: Union[Column, str], *, statement_params: Optional[Dict[str, str]] = None) → DataFrame [source] (https://github.com/snowflakedb/snowpark-python/blob/v1.23.0/src/snowflake/snowpark/dataframe_stat_functions.py#L167-L220)
- Computes a pair-wise frequency table (a contingency table) for the specified columns. The method returns a DataFrame containing this table.
- In the returned contingency table:
- The first column of each row contains the distinct values of col1.
- The name of the first column is the name of col1.
- The rest of the column names are the distinct values of col2.
- For pairs that have no occurrences, the contingency table contains 0 as the count. 
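The rules above can be illustrated with a minimal plain-Python sketch. This is not how Snowpark computes the table; `crosstab_sketch` is a hypothetical helper that only mirrors the described layout: one row per distinct col1 value, one count column per distinct col2 value, and 0 for pairs that never occur.

```python
from collections import Counter


def crosstab_sketch(pairs):
    """Build a contingency table from (col1_value, col2_value) pairs.

    Hypothetical helper for illustration only; not part of Snowpark.
    """
    counts = Counter(pairs)  # occurrences of each (col1, col2) pair
    row_keys = sorted({first for first, _ in pairs})    # distinct col1 values
    col_keys = sorted({second for _, second in pairs})  # distinct col2 values
    # Counter returns 0 for pairs that never occur, matching the 0 count rule.
    table = {r: [counts[(r, c)] for c in col_keys] for r in row_keys}
    return col_keys, table


pairs = [(1, 1), (1, 2), (2, 1), (2, 1), (2, 3), (3, 2), (3, 3)]
columns, table = crosstab_sketch(pairs)
print(columns)
for key, row in table.items():
    print(key, row)
```

Each dictionary key plays the role of the first column, and each list position corresponds to one distinct col2 value in `columns`.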
 
- Note: The number of distinct values in col2 should not exceed 1000.
- Example:
    >>> df = session.create_dataframe([(1, 1), (1, 2), (2, 1), (2, 1), (2, 3), (3, 2), (3, 3)], schema=["key", "value"])
    >>> ct = df.stat.crosstab("key", "value").sort(df["key"])
    >>> ct.show()
    ---------------------------------------------------------------------------------------------
    |"KEY"  |"CAST(1 AS NUMBER(38,0))"  |"CAST(2 AS NUMBER(38,0))"  |"CAST(3 AS NUMBER(38,0))"  |
    ---------------------------------------------------------------------------------------------
    |1      |1                          |1                          |0                          |
    |2      |2                          |0                          |1                          |
    |3      |0                          |1                          |1                          |
    ---------------------------------------------------------------------------------------------
- Parameters:
- col1 – The first column to use, given as a column name or a Column object.
- col2 – The second column to use, given as a column name or a Column object.
- statement_params – Dictionary of statement-level parameters to be set while executing this action.
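The following is a hedged sketch of one way to guard against the limit on distinct col2 values before calling crosstab. `checked_crosstab` and `MAX_DISTINCT_COL2` are hypothetical names, and raising `ValueError` is this sketch's own choice, not documented Snowpark behavior. It assumes `df` is a Snowpark DataFrame and that `col2` is passed as a column name.

```python
MAX_DISTINCT_COL2 = 1000  # limit stated in the note for this method


def checked_crosstab(df, col1, col2, statement_params=None):
    """Call df.stat.crosstab only if col2 has at most 1000 distinct values.

    Hypothetical wrapper for illustration; `df` is assumed to be a
    Snowpark DataFrame.
    """
    # Count distinct col2 values with standard DataFrame methods.
    distinct_values = df.select(col2).distinct().count()
    if distinct_values > MAX_DISTINCT_COL2:
        raise ValueError(
            f"{col2!r} has {distinct_values} distinct values; "
            f"crosstab expects at most {MAX_DISTINCT_COL2}"
        )
    return df.stat.crosstab(col1, col2, statement_params=statement_params)
```

Under these assumptions, a call might look like `checked_crosstab(df, "key", "value", statement_params={"QUERY_TAG": "crosstab-demo"})`, where the tag value is a placeholder. The extra count runs one more query, so the check trades some latency for an earlier, clearer failure.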