snowflake.snowpark.DataFrame.cache_result
- DataFrame.cache_result(*, statement_params: Optional[Dict[str, str]] = None) → Table [source] (https://github.com/snowflakedb/snowpark-python/blob/v1.16.0/src/snowflake/snowpark/dataframe.py#L3776-L3862)
Caches the content of this DataFrame to create a new cached Table DataFrame.
All subsequent operations on the returned cached DataFrame are performed on the cached data and have no effect on the original DataFrame.
You can use Table.drop_table() or the with statement to clean up the cached result when it is no longer needed. Refer to the example code below.
Note
An error is raised if a cached result is used again after it has been cleaned up, or if any DataFrame derived from that cached result is used again.
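One way to avoid the error described in the note is to materialize the rows before the cached table is dropped, so that no DataFrame tied to the dropped table is used afterwards. The following is a minimal sketch, not part of the Snowpark API: the helper name `collect_from_cache` and its `transform` argument are hypothetical, and it relies only on `cache_result()`, `collect()`, and the with statement support described above.

```python
def collect_from_cache(df, transform=None):
    """Cache `df`, optionally derive a DataFrame from the cached copy,
    and return plain rows before the cached table is cleaned up.

    `collect_from_cache` and `transform` are illustrative names,
    not part of Snowpark.
    """
    # The with statement cleans up the cached result when the block exits.
    with df.cache_result() as cached:
        derived = transform(cached) if transform is not None else cached
        # collect() runs inside the block, so only plain rows leave it;
        # no DataFrame derived from the cached table is used afterwards.
        return derived.collect()
```

For example, `collect_from_cache(df, lambda t: t.filter(t["NUM"] > 1))` would filter the cached copy and return the resulting rows, assuming `df` has a `NUM` column as in the examples on this page.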
- Examples::
>>> create_result = session.sql("create temp table RESULT (NUM int)").collect()
>>> insert_result = session.sql("insert into RESULT values(1),(2)").collect()

>>> df = session.table("RESULT")
>>> df.collect()
[Row(NUM=1), Row(NUM=2)]

>>> # Run cache_result and then insert into the original table to see
>>> # that the cached result is not affected
>>> df1 = df.cache_result()
>>> insert_again_result = session.sql("insert into RESULT values (3)").collect()
>>> df1.collect()
[Row(NUM=1), Row(NUM=2)]
>>> df.collect()
[Row(NUM=1), Row(NUM=2), Row(NUM=3)]

>>> # You can run cache_result on a result that has already been cached
>>> df2 = df1.cache_result()
>>> df2.collect()
[Row(NUM=1), Row(NUM=2)]

>>> df3 = df.cache_result()
>>> # Drop RESULT and see that the cached results still exist
>>> drop_table_result = session.sql("drop table RESULT").collect()
>>> df1.collect()
[Row(NUM=1), Row(NUM=2)]
>>> df2.collect()
[Row(NUM=1), Row(NUM=2)]
>>> df3.collect()
[Row(NUM=1), Row(NUM=2), Row(NUM=3)]
>>> # Clean up the cached result
>>> df3.drop_table()
>>> # Use a context manager to clean up the cached result after its use.
>>> with df2.cache_result() as df4:
...     df4.collect()
[Row(NUM=1), Row(NUM=2)]
- Parameters:
statement_params – Dictionary of statement level parameters to be set while executing this action.
- Returns:
A Table object that holds the cached result in a temporary table. All operations on this new DataFrame have no effect on the original.
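The statement_params argument can be illustrated with a small wrapper. This is a sketch under stated assumptions: the helper name `cache_with_query_tag` is hypothetical, and the use of `QUERY_TAG` as a statement-level parameter is an assumption about the Snowflake account configuration rather than something this page specifies. The wrapper only builds the dictionary and passes it through the keyword shown in the signature above.

```python
from typing import Dict, Optional


def cache_with_query_tag(df, tag: str,
                         extra_params: Optional[Dict[str, str]] = None):
    """Call df.cache_result() with a statement-level query tag.

    `cache_with_query_tag` is an illustrative helper, not a Snowpark API.
    QUERY_TAG is assumed to be accepted as a statement-level parameter.
    """
    # Copy so the caller's dictionary is never mutated.
    params: Dict[str, str] = dict(extra_params or {})
    params["QUERY_TAG"] = tag
    # statement_params is keyword-only in the documented signature.
    return df.cache_result(statement_params=params)
```

The returned Table still holds a temporary table, so the caller remains responsible for calling drop_table() on it or using it in a with statement.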