snowflake.snowpark.DataFrame.flatten¶
- DataFrame.flatten(input: Union[Column, str], path: Optional[str] = None, outer: bool = False, recursive: bool = False, mode: str = 'BOTH') DataFrame [source] (https://github.com/snowflakedb/snowpark-python/blob/v1.16.0/src/snowflake/snowpark/dataframe.py#L3092-L3168)¶
Flattens (explodes) compound values into multiple rows.
It creates a new
DataFrame
from thisDataFrame
, carries the existing columns to the newDataFrame
, and adds the following columns to it:SEQ
KEY
PATH
INDEX
VALUE
THIS
References: Snowflake SQL function FLATTEN.
If this
DataFrame
also has columns with the names above, you can disambiguate the columns by renaming them.Example:
>>> table1 = session.sql("select parse_json(numbers) as numbers from values('[1,2]') as T(numbers)") >>> flattened = table1.flatten(table1["numbers"]) >>> flattened.select(table1["numbers"], flattened["value"].as_("flattened_number")).show() ---------------------------------- |"NUMBERS" |"FLATTENED_NUMBER" | ---------------------------------- |[ |1 | | 1, | | | 2 | | |] | | |[ |2 | | 1, | | | 2 | | |] | | ----------------------------------
- Parameters:
input – The name of a column or a
Column
instance that will be unseated into rows. The column data must be of Snowflake data type VARIANT, OBJECT, or ARRAY.path – The path to the element within a VARIANT data structure which needs to be flattened. The outermost element is to be flattened if path is empty or
None
.outer – If
False
, any input rows that cannot be expanded, either because they cannot be accessed in thepath
or because they have zero fields or entries, are completely omitted from the output. Otherwise, exactly one row is generated for zero-row expansions (with NULL in the KEY, INDEX, and VALUE columns).recursive – If
False
, only the element referenced bypath
is expanded. Otherwise, the expansion is performed for all sub-elements recursively.mode – Specifies which types should be flattened “OBJECT”, “ARRAY”, or “BOTH”.
- Returns:
A new
DataFrame
that has the columns carried from thisDataFrame
, the flattened new columns and new rows.
See also
Session.flatten()
, which creates a newDataFrame
by flattening compound values into multiple rows.
This function or method is deprecated since 0.7.0. Use
join_table_function()
instead.