snowflake.snowpark.DataFrame.with_columns
- DataFrame.with_columns(col_names: List[str], values: List[Union[Column, TableFunctionCall]]) → DataFrame [source] (https://github.com/snowflakedb/snowpark-python/blob/v1.16.0/src/snowflake/snowpark/dataframe.py#L2758-L2853)
Returns a DataFrame with additional columns with the specified names col_names. The columns are computed by using the specified expressions values. If columns with the same names already exist in the DataFrame, those columns are removed and the new columns are appended at the end.
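The replace-and-append rule determines the column order of the result. The following is a minimal plain-Python sketch of that rule, working on column names only. The helper resulting_column_order is hypothetical and is not part of Snowpark, and the case-insensitive matching (upper-casing the names, as the output headers in the examples suggest for unquoted names) is an assumption rather than documented behavior.

```python
from typing import List


def resulting_column_order(existing: List[str], col_names: List[str]) -> List[str]:
    """Hypothetical helper: model the column order produced by with_columns."""
    # Assumption: unquoted names compare case-insensitively, so normalize to upper case.
    new = [name.upper() for name in col_names]
    # Existing columns that share a name with a new column are dropped ...
    kept = [name.upper() for name in existing if name.upper() not in new]
    # ... and the new columns are appended at the end, in the order given.
    return kept + new


# Adding two new columns keeps the existing ones in front.
print(resulting_column_order(["a", "b"], ["mean", "total"]))
# Replacing "a" moves it to the end, next to the newly added "d".
print(resulting_column_order(["a", "b", "c"], ["a", "d"]))
```

Under these assumptions, a replaced column does not keep its original position: it moves to the end with the other new columns, so code that relies on positional column access may need adjusting after the call.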
Example 1:
>>> from typing import Iterable, Tuple
>>> from snowflake.snowpark.functions import udtf
>>> @udtf(output_schema=["number"])
... class sum_udtf:
...     def process(self, a: int, b: int) -> Iterable[Tuple[int]]:
...         yield (a + b, )
>>> df = session.create_dataframe([[1, 2], [3, 4]], schema=["a", "b"])
>>> df.with_columns(["mean", "total"], [(df["a"] + df["b"]) / 2, sum_udtf(df.a, df.b)]).sort(df.a).show()
----------------------------------
|"A"  |"B"  |"MEAN"    |"TOTAL"  |
----------------------------------
|1    |2    |1.500000  |3        |
|3    |4    |3.500000  |7        |
----------------------------------
Example 2:
>>> from snowflake.snowpark.functions import lit, table_function
>>> split_to_table = table_function("split_to_table")
>>> df = session.sql("select 'James' as name, 'address1 address2 address3' as addresses")
>>> df.with_columns(["seq", "idx", "val"], [split_to_table(df.addresses, lit(" "))]).show()
------------------------------------------------------------------
|"NAME"  |"ADDRESSES"                 |"SEQ"  |"IDX"  |"VAL"     |
------------------------------------------------------------------
|James   |address1 address2 address3  |1      |1      |address1  |
|James   |address1 address2 address3  |1      |2      |address2  |
|James   |address1 address2 address3  |1      |3      |address3  |
------------------------------------------------------------------
- Parameters:
col_names – A list of the names of the columns to add or replace.
values – A list of the Column objects or table_function.TableFunctionCall object to add or replace.