snowflake.snowpark.Session.write_pandas
Session.write_pandas(df: DataFrame, table_name: str, *, database: Optional[str] = None, schema: Optional[str] = None, chunk_size: Optional[int] = None, compression: str = 'gzip', on_error: str = 'abort_statement', parallel: int = 4, quote_identifiers: bool = True, auto_create_table: bool = False, create_temp_table: bool = False, overwrite: bool = False, table_type: Literal['', 'temp', 'temporary', 'transient'] = '', **kwargs: Dict[str, Any]) → Table [source] (https://github.com/snowflakedb/snowpark-python/blob/v1.16.0/src/snowflake/snowpark/session.py#L2113-L2273)
Writes a pandas DataFrame to a table in Snowflake and returns a Snowpark DataFrame object referring to the table where the pandas DataFrame was written to.
Parameters:
df – The pandas DataFrame we’d like to write back.
table_name – Name of the table we want to insert into.
database – Database that the table is in. If not provided, the default one will be used.
schema – Schema that the table is in. If not provided, the default one will be used.
chunk_size – Number of rows to insert at a time. If not provided, all rows are written at once. Defaults to None normally, or to 100,000 inside a stored procedure.
compression – The compression used on the Parquet files: gzip or snappy. gzip generally gives better compression, while snappy is faster. Use whichever is more appropriate.
on_error – Action to take when COPY INTO statements fail. See details at copy options.
parallel – Number of threads to be used when uploading chunks. See details at parallel parameter.
quote_identifiers – By default, identifiers, specifically database, schema, table and column names (from DataFrame.columns), will be quoted. If set to False, identifiers are passed on to Snowflake without quoting, i.e. identifiers will be coerced to uppercase by Snowflake.
auto_create_table – When true, automatically creates a table to store the passed in pandas DataFrame using the passed in database, schema, and table_name. Note: there are usually multiple table configurations that would allow you to upload a particular pandas DataFrame successfully. If you don’t like the auto created table, you can always create your own table before calling this function. For example, auto-created tables will store list, tuple and dict as strings in a VARCHAR column.
create_temp_table – (Deprecated) The to-be-created table will be temporary if this is set to True. Note that to avoid breaking changes, currently when this is set to True, it overrides table_type.
overwrite – Default value is False and the pandas DataFrame data is appended to the existing table. If set to True and if auto_create_table is also set to True, then it drops the table. If set to True and if auto_create_table is set to False, then it truncates the table. Note that in both cases (when overwrite is set to True) it will replace the existing contents of the table with that of the passed in pandas DataFrame.
table_type – The table type of table to be created. The supported values are: temp, temporary, and transient. An empty string means to create a permanent table. Learn more about table types here.
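The interaction between overwrite and auto_create_table can be made explicit with a small wrapper. The following is only a sketch: replace_table_contents and its recreate flag are hypothetical names introduced for illustration, and session is assumed to be an already-open Snowpark Session.

```python
from typing import Any


def replace_table_contents(session: Any, df: Any, table_name: str, recreate: bool = False) -> Any:
    """Replace the contents of table_name with the rows of df.

    Hypothetical helper, not part of Snowpark. It only forwards arguments
    to session.write_pandas.

    recreate=False: overwrite=True with auto_create_table=False, which the
    overwrite parameter description says truncates the existing table.
    recreate=True: overwrite=True with auto_create_table=True, which the
    overwrite parameter description says drops the table.
    """
    return session.write_pandas(
        df,
        table_name,
        overwrite=True,
        auto_create_table=recreate,
    )
```

Truncating (recreate=False) presumably keeps the existing table definition, so it may be the safer choice when the table was created by hand with specific column types.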
Example:
>>> import pandas as pd
>>> pandas_df = pd.DataFrame([(1, "Steve"), (2, "Bob")], columns=["id", "name"])
>>> snowpark_df = session.write_pandas(pandas_df, "write_pandas_table", auto_create_table=True, table_type="temp")
>>> snowpark_df.sort('"id"').to_pandas()
   id   name
0   1  Steve
1   2    Bob
>>> pandas_df2 = pd.DataFrame([(3, "John")], columns=["id", "name"])
>>> snowpark_df2 = session.write_pandas(pandas_df2, "write_pandas_table", auto_create_table=False)
>>> snowpark_df2.sort('"id"').to_pandas()
   id   name
0   1  Steve
1   2    Bob
2   3   John
>>> pandas_df3 = pd.DataFrame([(1, "Jane")], columns=["id", "name"])
>>> snowpark_df3 = session.write_pandas(pandas_df3, "write_pandas_table", auto_create_table=False, overwrite=True)
>>> snowpark_df3.to_pandas()
   id  name
0   1  Jane
>>> pandas_df4 = pd.DataFrame([(1, "Jane")], columns=["id", "name"])
>>> snowpark_df4 = session.write_pandas(pandas_df4, "write_pandas_transient_table", auto_create_table=True, table_type="transient")
>>> snowpark_df4.to_pandas()
   id  name
0   1  Jane
Note
Unless auto_create_table is True, you must first create a table in Snowflake that the passed in pandas DataFrame can be written to. If your pandas DataFrame cannot be written to the specified table, an exception will be raised.
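One way to satisfy this requirement is to create the table yourself and then write with auto_create_table left at False. The sketch below is an assumption-laden illustration, not the library's recommended pattern: create_then_write and column_ddl are hypothetical names, session is assumed to be an open Snowpark Session, and the table is created in the session's current database and schema.

```python
from typing import Any


def create_then_write(session: Any, df: Any, table_name: str, column_ddl: str) -> Any:
    """Create table_name if it is missing, then write df into it.

    Hypothetical helper, not part of Snowpark. table_name and column_ddl
    are interpolated into SQL as-is, so they must come from trusted input.
    """
    # Quote the table name in the DDL so that it matches the quoted name
    # write_pandas uses while quote_identifiers keeps its default of True.
    session.sql(f'CREATE TABLE IF NOT EXISTS "{table_name}" ({column_ddl})').collect()
    # auto_create_table=False: write into the table created above.
    return session.write_pandas(df, table_name, auto_create_table=False)
```

A call might look like create_then_write(session, pandas_df, "people", '"id" NUMBER, "name" VARCHAR'). The column names are quoted in the DDL for the same reason as the table name: with quote_identifiers=True the DataFrame's lowercase column names are sent quoted, so an unquoted id column (stored by Snowflake as ID) would not match.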