snowflake.snowpark.DataFrameAIFunctions.summarize_agg
- DataFrameAIFunctions.summarize_agg(input_column: Union[snowflake.snowpark.column.Column, str], *, output_column: Optional[str] = None) -> snowflake.snowpark.DataFrame [source] (https://github.com/snowflakedb/snowpark-python/blob/v1.42.0/src/snowflake/snowpark/dataframe_ai_functions.py#L962-L1062)
- Summarize a column of text data using AI.
- This method aggregates and summarizes text data from multiple rows into a single comprehensive summary. It’s particularly useful for creating summaries from collections of reviews, feedback, transcripts, or other text content.
- Parameters:
- input_column – The column (Column object or column name as string) containing the text data to summarize. 
- output_column – The name of the output column to be appended. If not provided, a column named AI_SUMMARIZE_AGG_OUTPUT is appended.
 
- Returns:
- A new DataFrame with a single row containing the summarized text. 
- Examples:

    >>> # Summarize product reviews
    >>> df = session.create_dataframe([
    ...     ["The product quality is excellent and shipping was fast."],
    ...     ["Great value for money, highly recommend!"],
    ...     ["Customer service was very helpful and responsive."],
    ...     ["The packaging could be better, but the product itself is good."],
    ...     ["Easy to use and works as advertised."],
    ... ], schema=["review"])
    >>> summary_df = df.ai.summarize_agg(
    ...     input_column="review",
    ...     output_column="reviews_summary"
    ... )
    >>> summary_df.columns
    ['REVIEWS_SUMMARY']
    >>> summary_df.count()
    1
    >>> results = summary_df.collect()
    >>> len(results[0]["REVIEWS_SUMMARY"]) > 10
    True

    >>> # Summarize with Column object
    >>> from snowflake.snowpark.functions import col
    >>> df = session.create_dataframe([
    ...     ["Meeting started with project updates"],
    ...     ["Discussed timeline and deliverables"],
    ...     ["Identified key risks and mitigation strategies"],
    ...     ["Assigned action items to team members"],
    ... ], schema=["meeting_notes"])
    >>> summary_df = df.ai.summarize_agg(
    ...     input_column=col("meeting_notes"),
    ...     output_column="meeting_summary"
    ... )
    >>> summary_df.columns
    ['MEETING_SUMMARY']
    >>> summary_df.count()
    1

- Note
- This is an aggregation function that combines multiple rows into a single summary
- For best results, provide clear and coherent text in the input column 
- The summary will capture the main themes and important points from all input rows 
- Unlike the agg method, which requires a task description, summarize_agg automatically generates a comprehensive summary
 - This function or method is experimental since 1.39.0.
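
Because the returned DataFrame holds a single row with a single summary column, callers often just want the summary as a plain Python string. The following is a minimal sketch of such a wrapper, not part of the Snowpark API: the helper name `summarize_column` is hypothetical, and it assumes the summary is the first (and only) column of the result, as the `summary_df.columns` output in the examples suggests. The Snowpark import is used for type hints only.

```python
from __future__ import annotations

from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:  # type hints only; nothing is imported at runtime
    from snowflake.snowpark import DataFrame


def summarize_column(
    df: DataFrame,
    text_column: str,
    output_column: Optional[str] = None,
) -> Optional[str]:
    """Hypothetical helper: return the AI summary of `text_column` as a string.

    Returns None if the summary DataFrame unexpectedly has no rows.
    """
    # Pass output_column only when the caller supplied one, so the
    # documented default name (AI_SUMMARIZE_AGG_OUTPUT) applies otherwise.
    kwargs = {} if output_column is None else {"output_column": output_column}
    summary_df = df.ai.summarize_agg(input_column=text_column, **kwargs)

    rows = summary_df.collect()
    if not rows:
        return None

    # Assumption: the result has exactly one column, so read it by
    # position rather than depending on how the name is normalized.
    return rows[0][0]
```

With a DataFrame like the one in the first example, `summarize_column(df, "review")` would return the summary text directly. Reading the value by position avoids depending on how the output column name is upper-cased.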