Streaming output

This topic describes how to stream real-time responses from the Cortex Code Agent SDK.

By default, the SDK yields complete AssistantMessage objects after the model finishes generating each response. To receive incremental updates as text and thinking blocks are generated, enable partial message streaming by setting includePartialMessages to true (TypeScript) or include_partial_messages to True (Python).

When partial messages are enabled, Cortex Code emits StreamEvent objects for partial text and thinking content. Complete tool calls still arrive as AssistantMessage objects, and tool results still arrive as UserMessage objects.

Enable streaming output

With partial messages enabled, the SDK yields StreamEvent messages, each wrapping one raw streaming event, in addition to the usual AssistantMessage, UserMessage, and ResultMessage objects. To print text as it arrives, your code needs to:

  1. Check each message’s type to distinguish StreamEvent from other types.

  2. For StreamEvent, extract the event field and check its type.

  3. Look for content_block_delta events where delta.type is text_delta, and read the new text from delta.text.

import { query } from "cortex-code-agent-sdk";

for await (const message of query({
  prompt: "List the files in my project",
  options: {
    cwd: process.cwd(),
    includePartialMessages: true,
    allowedTools: ["Bash", "Read"],
  },
})) {
  if (message.type === "stream_event") {
    const event = message.event;
    if (event.type === "content_block_delta") {
      if (event.delta.type === "text_delta") {
        process.stdout.write(event.delta.text);
      }
    }
  }
}

StreamEvent reference

When partial messages are enabled, you receive raw streaming events wrapped in an object:

interface SDKPartialAssistantMessage {
  type: "stream_event";
  event: Record<string, unknown>;  // Raw streaming event
  parent_tool_use_id: string | null;
  uuid: string;
  session_id: string;
}

The event field contains the raw partial streaming event emitted by Cortex Code. Common event types:

Event Type            Description
content_block_start   Start of a new text or thinking block
content_block_delta   Incremental text or thinking update
content_block_stop    End of the current text or thinking block
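Because the interface above types event as Record<string, unknown>, nested fields such as delta.type may need narrowing before use under strict TypeScript settings. The following is a minimal sketch of one way to do that. The helper names isRecord and extractTextDelta are hypothetical and not part of the SDK, and the sample events are hand-written to match the shapes used in this topic rather than captured from a real session.

```typescript
// Hypothetical helpers (not part of the SDK): narrow a raw event, typed as
// Record<string, unknown>, down to the text carried by a text_delta.
type RawEvent = Record<string, unknown>;

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Returns the delta text for a content_block_delta/text_delta event, else null.
function extractTextDelta(event: RawEvent): string | null {
  if (event.type !== "content_block_delta") return null;
  const delta = event.delta;
  if (!isRecord(delta) || delta.type !== "text_delta") return null;
  return typeof delta.text === "string" ? delta.text : null;
}

// Hand-written sample events; real events may carry additional fields.
const samples: RawEvent[] = [
  { type: "content_block_start" },
  { type: "content_block_delta", delta: { type: "text_delta", text: "Hello, " } },
  { type: "content_block_delta", delta: { type: "thinking_delta" } },
  { type: "content_block_delta", delta: { type: "text_delta", text: "world" } },
  { type: "content_block_stop" },
];

const streamedText = samples
  .map(extractTextDelta)
  .filter((chunk): chunk is string => chunk !== null)
  .join("");

console.log(streamedText); // prints "Hello, world"
```

Returning null for everything that is not a text delta keeps the calling loop to a single check per event.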

Message flow

With partial messages enabled, you commonly receive messages in the following order:

SystemMessage -- session initialization
StreamEvent (content_block_start) -- text or thinking block
StreamEvent (content_block_delta) -- text_delta or thinking_delta chunks...
StreamEvent (content_block_stop)
AssistantMessage -- complete text/thinking block, or complete tool_use block
UserMessage -- complete tool_result block
... more assistant/user turns ...
ResultMessage -- final result

Without partial messages enabled, you still receive the same complete assistant, user, and result messages, but no StreamEvent messages. Depending on the session, the SDK can also emit system events such as initialization, status, and background-task notifications.
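The order listed above can be handled with a single branch on each message's type. The sketch below labels a hand-written sequence of messages. Only the "stream_event" and "result" type strings appear in this topic; "system", "assistant", and "user" are assumed values, and FlowMessage and describeMessage are hypothetical names used for illustration.

```typescript
// Minimal message shape for this sketch. Only "stream_event" and "result" are
// confirmed by this topic; "system", "assistant", and "user" are assumptions.
interface FlowMessage {
  type: string;
  event?: Record<string, unknown>;
}

// Labels a message as session setup, partial output, complete output, or final result.
function describeMessage(message: FlowMessage): string {
  switch (message.type) {
    case "system":
      return "session: system";
    case "stream_event":
      return `partial: ${String(message.event?.type ?? "unknown")}`;
    case "assistant":
    case "user":
      return `complete: ${message.type}`;
    case "result":
      return "final: result";
    default:
      return `other: ${message.type}`;
  }
}

// Hand-written sequence mirroring one assistant turn from the flow above.
const flow: FlowMessage[] = [
  { type: "system" },
  { type: "stream_event", event: { type: "content_block_start" } },
  { type: "stream_event", event: { type: "content_block_delta" } },
  { type: "stream_event", event: { type: "content_block_stop" } },
  { type: "assistant" },
  { type: "result" },
];

const flowLog = flow.map(describeMessage);
console.log(flowLog.join("\n"));
```

The default branch keeps the loop tolerant of message types it does not recognize, such as status or background-task notifications.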

Stream text responses

To display text as it’s generated, look for content_block_delta events where delta.type is text_delta:

import { query } from "cortex-code-agent-sdk";

for await (const message of query({
  prompt: "Explain how databases work",
  options: { cwd: process.cwd(), includePartialMessages: true },
})) {
  if (message.type === "stream_event") {
    const event = message.event;
    if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
      process.stdout.write(event.delta.text);
    }
  }
}
console.log(); // Final newline

Build a streaming UI

The following example accumulates streamed text in a local buffer and re-renders the current response each time a new text_delta arrives. In a real application, replace the render function with your framework’s state update logic:

import { query } from "cortex-code-agent-sdk";

let currentText = "";

function render(text: string) {
  console.clear();
  console.log("Assistant:\n");
  process.stdout.write(text);
}

for await (const message of query({
  prompt: "Explain how databases work",
  options: {
    cwd: process.cwd(),
    includePartialMessages: true,
  },
})) {
  if (message.type === "stream_event") {
    const event = message.event;
    if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
      currentText += event.delta.text;
      render(currentText);
    }
  } else if (message.type === "result") {
    console.log("\n\n--- Complete ---");
  }
}
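One way to adapt the example above to a UI framework is to move the accumulation into a pure function that maps the previous state and an incoming message to the next state. The sketch below assumes the message and event shapes used in this topic. ChatViewState, IncomingMessage, and reduceMessage are hypothetical names, and the messages are hand-written stand-ins for what query() would yield.

```typescript
// Hypothetical view state for a streaming chat UI (not an SDK type).
interface ChatViewState {
  text: string;  // text accumulated so far
  done: boolean; // true once the result message has arrived
}

// Minimal message shape for this sketch.
interface IncomingMessage {
  type: string;
  event?: Record<string, unknown>;
}

const initialViewState: ChatViewState = { text: "", done: false };

// Pure state update: appends text_delta chunks and marks completion on "result".
function reduceMessage(state: ChatViewState, message: IncomingMessage): ChatViewState {
  if (message.type === "result") return { ...state, done: true };
  if (message.type !== "stream_event" || !message.event) return state;
  if (message.event.type !== "content_block_delta") return state;

  const delta = message.event.delta;
  if (typeof delta !== "object" || delta === null) return state;

  const fields = delta as Record<string, unknown>;
  if (fields.type === "text_delta" && typeof fields.text === "string") {
    return { ...state, text: state.text + fields.text };
  }
  return state;
}

// Hand-written messages standing in for a real session.
const incoming: IncomingMessage[] = [
  { type: "stream_event", event: { type: "content_block_delta", delta: { type: "text_delta", text: "Hello, " } } },
  { type: "stream_event", event: { type: "content_block_delta", delta: { type: "text_delta", text: "world" } } },
  { type: "assistant" },
  { type: "result" },
];

const finalViewState = incoming.reduce(reduceMessage, initialViewState);
console.log(finalViewState); // { text: "Hello, world", done: true }
```

Because reduceMessage never mutates its input, it can be passed to a reducer-style state hook or store, and it can be unit tested without a live session.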

Known limitations

Feature             Impact on streaming
Structured output   JSON result appears only in ResultMessage.structured_output, not as streaming deltas
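In practice this means that code expecting structured output should wait for the result message rather than parse streamed deltas. The following is a minimal sketch, assuming the result message has type "result" (as in the streaming UI example) and exposes the structured_output field named above. ResultLike and findStructuredOutput are hypothetical names, and the sample payload is invented for illustration.

```typescript
// Minimal message shape for this sketch (not an SDK type).
interface ResultLike {
  type: string;
  structured_output?: unknown;
}

// Returns structured_output from the first result message, or undefined if none arrived.
function findStructuredOutput(messages: ResultLike[]): unknown {
  for (const message of messages) {
    if (message.type === "result") return message.structured_output;
  }
  return undefined;
}

// Hand-written messages: stream events carry no structured output.
const received: ResultLike[] = [
  { type: "stream_event" },
  { type: "assistant" },
  { type: "result", structured_output: { files: ["a.ts", "b.ts"] } },
];

const structured = findStructuredOutput(received);
console.log(JSON.stringify(structured)); // prints {"files":["a.ts","b.ts"]}
```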