This topic describes how to stream real-time responses from the Cortex Code Agent SDK.
By default, the SDK yields complete AssistantMessage objects after the model finishes generating each response. To
receive incremental updates as text and thinking blocks are generated, enable partial message streaming by setting
includePartialMessages (TypeScript) or include_partial_messages (Python) to true.
When partial messages are enabled, Cortex Code emits StreamEvent objects for partial text and thinking content.
Complete tool calls still arrive as AssistantMessage objects, and tool results still arrive as UserMessage
objects.
With partial messages enabled, the SDK yields StreamEvent messages containing partial streaming events, in addition to the usual
AssistantMessage, UserMessage, and ResultMessage objects. Your code needs to:
1. Check each message’s type to distinguish StreamEvent from other types.
2. For each StreamEvent, extract the event field and check its type.
3. Look for content_block_delta events where delta.type is text_delta.
TypeScript:

    import { query } from "cortex-code-agent-sdk";

    for await (const message of query({
      prompt: "List the files in my project",
      options: {
        cwd: process.cwd(),
        includePartialMessages: true,
        allowedTools: ["Bash", "Read"],
      },
    })) {
      if (message.type === "stream_event") {
        const event = message.event;
        if (event.type === "content_block_delta") {
          if (event.delta.type === "text_delta") {
            process.stdout.write(event.delta.text);
          }
        }
      }
    }

Python:

    import asyncio
    from cortex_code_agent_sdk import query, CortexCodeAgentOptions
    from cortex_code_agent_sdk.types import StreamEvent


    async def stream_response():
        async for message in query(
            prompt="List the files in my project",
            options=CortexCodeAgentOptions(
                cwd=".",
                include_partial_messages=True,
                allowed_tools=["Bash", "Read"],
            ),
        ):
            if isinstance(message, StreamEvent):
                event = message.event
                if event.get("type") == "content_block_delta":
                    delta = event.get("delta", {})
                    if delta.get("type") == "text_delta":
                        print(delta.get("text", ""), end="", flush=True)


    asyncio.run(stream_response())
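The three checks listed earlier can also be factored into a small helper that works on the raw event dict, which keeps the filtering logic testable without a live session. This is a minimal sketch: extract_text_delta is a hypothetical name, not part of the SDK, and the sample events are hand-written to match the shape used in the examples above.

```python
from typing import Any, Optional


def extract_text_delta(event: dict[str, Any]) -> Optional[str]:
    """Return the text chunk carried by a raw streaming event, or None.

    Hypothetical helper (not part of the SDK). It applies the same checks
    as the loop above: event type first, then delta type.
    """
    if event.get("type") != "content_block_delta":
        return None
    delta = event.get("delta") or {}
    if delta.get("type") != "text_delta":
        return None
    return delta.get("text", "")


# Hand-written sample events, for illustration only.
sample_events = [
    {"type": "content_block_start"},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "Hello"}},
    {"type": "content_block_delta", "delta": {"type": "thinking_delta"}},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": ", world"}},
    {"type": "content_block_stop"},
]

# Keep only the events that carry a text chunk, then join the chunks.
chunks = [t for e in sample_events if (t := extract_text_delta(e)) is not None]
streamed_text = "".join(chunks)
```

Inside the streaming loop, the helper would be called with message.event for each StreamEvent, and any non-None result written to the output.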
When partial messages are enabled, you receive raw streaming events wrapped in an object:
TypeScript:

    interface SDKPartialAssistantMessage {
      type: "stream_event";
      event: Record<string, unknown>; // Raw streaming event
      parent_tool_use_id: string | null;
      uuid: string;
      session_id: string;
    }

Python:

    @dataclass
    class StreamEvent:
        uuid: str                       # Unique identifier
        session_id: str                 # Session identifier
        event: dict[str, Any]           # Raw streaming event
        parent_tool_use_id: str | None  # Parent tool ID if from a subagent
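One use of the wrapper fields is to separate main-agent output from subagent output. The sketch below assumes that parent_tool_use_id is None for events that do not come from a subagent, as the field comment suggests. StreamEventStandIn is a local stand-in that mirrors the fields shown above so the example runs without the SDK; is_top_level is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class StreamEventStandIn:
    """Stand-in mirroring the StreamEvent fields shown above (not the SDK class)."""
    uuid: str
    session_id: str
    event: dict[str, Any]
    parent_tool_use_id: Optional[str] = None


def is_top_level(message: StreamEventStandIn) -> bool:
    """True when the event carries no parent tool ID (assumed: not from a subagent)."""
    return message.parent_tool_use_id is None


# Hand-written messages; the IDs are made up for illustration.
messages = [
    StreamEventStandIn("u1", "s1", {"type": "content_block_start"}),
    StreamEventStandIn("u2", "s1", {"type": "content_block_delta"}, parent_tool_use_id="tool_123"),
    StreamEventStandIn("u3", "s1", {"type": "content_block_stop"}),
]

top_level_ids = [m.uuid for m in messages if is_top_level(m)]
```

With the real SDK class, the same check would read message.parent_tool_use_id directly.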
The event field contains the raw partial streaming event emitted by Cortex Code. Common event types include content_block_start, content_block_delta, and content_block_stop.
With partial messages enabled, you commonly receive messages in the following order:
SystemMessage -- session initialization
StreamEvent (content_block_start) -- text or thinking block
StreamEvent (content_block_delta) -- text_delta or thinking_delta chunks...
StreamEvent (content_block_stop)
AssistantMessage -- complete text/thinking block, or complete tool_use block
UserMessage -- complete tool_result block
... more assistant/user turns ...
ResultMessage -- final result
Without partial messages enabled, you still receive the same complete assistant, user, and result messages, but not
StreamEvent. Depending on the session, the SDK can also emit system events such as initialization, status, and
background-task notifications.
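The start, delta, stop ordering described above can be sketched as a small accumulator that rebuilds each block from its raw events. BlockAccumulator is a hypothetical helper, not an SDK class. The text key on text_delta matches the examples in this topic; the thinking key on thinking_delta is an assumption and may differ in practice.

```python
from typing import Any, Optional


class BlockAccumulator:
    """Hypothetical helper that rebuilds content blocks from raw streaming events."""

    def __init__(self) -> None:
        self.completed: list[dict[str, str]] = []
        self._current: Optional[dict[str, str]] = None

    def feed(self, event: dict[str, Any]) -> None:
        kind = event.get("type")
        if kind == "content_block_start":
            # A new block begins: start with empty buffers.
            self._current = {"text": "", "thinking": ""}
        elif kind == "content_block_delta" and self._current is not None:
            delta = event.get("delta") or {}
            if delta.get("type") == "text_delta":
                self._current["text"] += delta.get("text", "")
            elif delta.get("type") == "thinking_delta":
                # Assumption: thinking chunks carry their text under "thinking".
                self._current["thinking"] += delta.get("thinking", "")
        elif kind == "content_block_stop" and self._current is not None:
            # The block is finished: store it and wait for the next start.
            self.completed.append(self._current)
            self._current = None


# Hand-written events in the documented order, for illustration only.
events = [
    {"type": "content_block_start"},
    {"type": "content_block_delta", "delta": {"type": "thinking_delta", "thinking": "plan"}},
    {"type": "content_block_stop"},
    {"type": "content_block_start"},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "Hi "}},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "there"}},
    {"type": "content_block_stop"},
]

acc = BlockAccumulator()
for ev in events:
    acc.feed(ev)
```

Because the complete AssistantMessage commonly follows content_block_stop, an application that has already displayed the streamed text may want to skip printing the same text again when the complete message arrives.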
To display text as it’s generated, look for content_block_delta events where delta.type is text_delta:
TypeScript:

    import { query } from "cortex-code-agent-sdk";

    for await (const message of query({
      prompt: "Explain how databases work",
      options: { cwd: process.cwd(), includePartialMessages: true },
    })) {
      if (message.type === "stream_event") {
        const event = message.event;
        if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
          process.stdout.write(event.delta.text);
        }
      }
    }

    console.log(); // Final newline

Python:

    import asyncio
    from cortex_code_agent_sdk import query, CortexCodeAgentOptions
    from cortex_code_agent_sdk.types import StreamEvent


    async def stream_text():
        async for message in query(
            prompt="Explain how databases work",
            options=CortexCodeAgentOptions(cwd=".", include_partial_messages=True),
        ):
            if isinstance(message, StreamEvent):
                event = message.event
                if event.get("type") == "content_block_delta":
                    delta = event.get("delta", {})
                    if delta.get("type") == "text_delta":
                        print(delta.get("text", ""), end="", flush=True)

        print()  # Final newline


    asyncio.run(stream_text())
The following example accumulates streamed text in a local buffer and re-renders the current response each time a new
text_delta arrives. In a real application, replace the render function with your framework’s state update
logic:
TypeScript:

    import { query } from "cortex-code-agent-sdk";

    let currentText = "";

    function render(text: string) {
      console.clear();
      console.log("Assistant:\n");
      process.stdout.write(text);
    }

    for await (const message of query({
      prompt: "Explain how databases work",
      options: {
        cwd: process.cwd(),
        includePartialMessages: true,
      },
    })) {
      if (message.type === "stream_event") {
        const event = message.event;
        if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
          currentText += event.delta.text;
          render(currentText);
        }
      } else if (message.type === "result") {
        console.log("\n\n--- Complete ---");
      }
    }

Python:

    import asyncio
    import sys
    from cortex_code_agent_sdk import query, CortexCodeAgentOptions, ResultMessage
    from cortex_code_agent_sdk.types import StreamEvent


    def render(text: str) -> None:
        sys.stdout.write("\033[2J\033[H")
        sys.stdout.write("Assistant:\n\n")
        sys.stdout.write(text)
        sys.stdout.flush()


    async def streaming_ui():
        current_text = ""

        async for message in query(
            prompt="Explain how databases work",
            options=CortexCodeAgentOptions(
                cwd=".",
                include_partial_messages=True,
            ),
        ):
            if isinstance(message, StreamEvent):
                event = message.event
                if event.get("type") == "content_block_delta":
                    delta = event.get("delta", {})
                    if delta.get("type") == "text_delta":
                        current_text += delta.get("text", "")
                        render(current_text)
            elif isinstance(message, ResultMessage):
                print("\n\n--- Complete ---")


    asyncio.run(streaming_ui())
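Re-rendering on every text_delta can be more work than a UI needs when chunks arrive quickly. As one possible refinement, the sketch below coalesces updates so the render callback runs at most once per interval, with a final flush at the end. ThrottledRenderer, its min_interval parameter, and the injectable clock are all hypothetical names introduced for illustration; none of them come from the SDK.

```python
import time
from typing import Callable, Optional


class ThrottledRenderer:
    """Hypothetical wrapper that limits how often a render callback runs."""

    def __init__(
        self,
        render: Callable[[str], None],
        min_interval: float = 0.05,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._render = render
        self._min_interval = min_interval
        self._clock = clock
        self._last_render: Optional[float] = None
        self._text = ""
        self._dirty = False

    def append(self, chunk: str) -> None:
        """Add a streamed chunk and re-render if enough time has passed."""
        self._text += chunk
        self._dirty = True
        now = self._clock()
        if self._last_render is None or now - self._last_render >= self._min_interval:
            self._flush(now)

    def finish(self) -> None:
        """Render any text that has not been shown yet."""
        if self._dirty:
            self._flush(self._clock())

    def _flush(self, now: float) -> None:
        self._render(self._text)
        self._last_render = now
        self._dirty = False


# Demonstration with a fake clock so the behavior is deterministic.
fake_time = [0.0]
frames: list[str] = []
renderer = ThrottledRenderer(frames.append, min_interval=0.05, clock=lambda: fake_time[0])

for chunk, t in [("Data", 0.00), ("bases ", 0.01), ("store ", 0.02), ("data.", 0.06)]:
    fake_time[0] = t
    renderer.append(chunk)
renderer.finish()
```

In the streaming loop, renderer.append would replace the direct render call for each text_delta, and renderer.finish would run when the ResultMessage arrives.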
Where your configuration of Cortex Code uses a model provided on the
Model and Service Pass-Through Terms,
your use of that model is further subject to the terms for that model on that page.
The data classification of inputs and outputs is as set forth in the following table.