This topic describes how to return schema-validated JSON from agent workflows using JSON Schema. The agent can use any
tools it needs to complete the task, and on successful validation the result includes structured data that matches your
schema.
Define a JSON Schema (https://json-schema.org/understanding-json-schema/about) for the structure you need, and the
SDK validates the model’s final output against it.
Agents return free-form text by default, which works for conversational use cases but not when you need to use the
output programmatically. Structured outputs provide typed data you can pass directly to your application logic,
database, or UI components.
Consider an agent that analyzes a codebase. Without structured outputs you get free-form text that you would need to
parse yourself. With structured outputs you define the shape you want and get typed data you can use directly:
Without structured outputs: free-form text, for example "This codebase uses Python and TypeScript. It has 42 files and the main entry point is..."
With structured outputs: typed data in the shape your schema defines.
Pass a JSON Schema to the outputFormat (TypeScript) or output_format (Python) option. When validation
succeeds, the result message includes a structured_output field with data matching your schema. If the agent
cannot satisfy the schema after retries, the SDK returns an error result instead.
TypeScript:

    import { query } from "cortex-code-agent-sdk";

    const schema = {
      type: "object",
      properties: {
        company_name: { type: "string" },
        founded_year: { type: "number" },
        headquarters: { type: "string" },
      },
      required: ["company_name"],
    };

    for await (const message of query({
      prompt: "Research Snowflake and provide key company information",
      options: {
        cwd: process.cwd(),
        outputFormat: { type: "json_schema", schema },
      },
    })) {
      if (message.type === "result" && message.structured_output) {
        console.log(message.structured_output);
        // { company_name: "Snowflake", founded_year: 2012, headquarters: "Bozeman, MT" }
      }
    }

Python:

    import asyncio
    from cortex_code_agent_sdk import query, CortexCodeAgentOptions, ResultMessage

    schema = {
        "type": "object",
        "properties": {
            "company_name": {"type": "string"},
            "founded_year": {"type": "number"},
            "headquarters": {"type": "string"},
        },
        "required": ["company_name"],
    }

    async def main():
        async for message in query(
            prompt="Research Snowflake and provide key company information",
            options=CortexCodeAgentOptions(
                cwd=".",
                output_format={"type": "json_schema", "schema": schema},
            ),
        ):
            if isinstance(message, ResultMessage) and message.structured_output:
                print(message.structured_output)
                # {'company_name': 'Snowflake', 'founded_year': 2012, 'headquarters': 'Bozeman, MT'}

    asyncio.run(main())
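To make "data matching your schema" concrete, the following is a minimal standard-library sketch of the two checks the company schema implies: required keys are present and declared types are respected. It is a hand-rolled illustration, not the SDK's validator. `check_against_schema` is a hypothetical helper that handles only flat object schemas, and the sample payloads are made up for the example.

```python
from typing import Any

# Maps JSON Schema type names to Python types (flat subset only).
_JSON_TYPES: dict[str, tuple[type, ...]] = {
    "string": (str,),
    "number": (int, float),
    "boolean": (bool,),
    "object": (dict,),
    "array": (list,),
}


def check_against_schema(data: dict[str, Any], schema: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the data passed.

    Hypothetical helper: it understands only `required` and a per-property
    `type`, far less than a real JSON Schema validator.
    """
    problems = []
    for key in schema.get("required", []):
        if key not in data:
            problems.append(f"missing required field: {key}")
    for key, rule in schema.get("properties", {}).items():
        if key not in data:
            continue  # optional field that was omitted
        expected = _JSON_TYPES.get(rule.get("type"), ())
        value = data[key]
        # bool is a subclass of int in Python, so reject it for "number".
        is_bool_as_number = rule.get("type") == "number" and isinstance(value, bool)
        if expected and (not isinstance(value, expected) or is_bool_as_number):
            problems.append(f"wrong type for field: {key}")
    return problems


schema = {
    "type": "object",
    "properties": {
        "company_name": {"type": "string"},
        "founded_year": {"type": "number"},
        "headquarters": {"type": "string"},
    },
    "required": ["company_name"],
}

good = {"company_name": "Snowflake", "founded_year": 2012}
bad = {"founded_year": "2012"}

print(check_against_schema(good, schema))  # []
print(check_against_schema(bad, schema))
# ['missing required field: company_name', 'wrong type for field: founded_year']
```

Note that `headquarters` can be omitted without error because it is not listed in `required`.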
Instead of writing JSON Schema by hand, use Zod (https://zod.dev/) (TypeScript) or
Pydantic (https://docs.pydantic.dev/latest/) (Python) to define your schema. These libraries generate the
JSON Schema for you and let you parse the response into a fully typed object with autocomplete and type checking.
TypeScript:

    import { z } from "zod";
    import { query } from "cortex-code-agent-sdk";

    const FeaturePlan = z.object({
      feature_name: z.string(),
      summary: z.string(),
      steps: z.array(
        z.object({
          step_number: z.number(),
          description: z.string(),
          estimated_complexity: z.enum(["low", "medium", "high"]),
        })
      ),
      risks: z.array(z.string()),
    });

    type FeaturePlan = z.infer<typeof FeaturePlan>;

    const schema = z.toJSONSchema(FeaturePlan);

    for await (const message of query({
      prompt: "Plan how to add dark mode support to a React app.",
      options: {
        cwd: process.cwd(),
        outputFormat: { type: "json_schema", schema },
      },
    })) {
      if (message.type === "result" && message.structured_output) {
        const parsed = FeaturePlan.safeParse(message.structured_output);
        if (parsed.success) {
          const plan: FeaturePlan = parsed.data;
          console.log(`Feature: ${plan.feature_name}`);
          plan.steps.forEach((step) => {
            console.log(`${step.step_number}. [${step.estimated_complexity}] ${step.description}`);
          });
        }
      }
    }

Python:

    import asyncio
    from pydantic import BaseModel
    from cortex_code_agent_sdk import query, CortexCodeAgentOptions, ResultMessage

    class Step(BaseModel):
        step_number: int
        description: str
        estimated_complexity: str  # 'low', 'medium', 'high'

    class FeaturePlan(BaseModel):
        feature_name: str
        summary: str
        steps: list[Step]
        risks: list[str]

    async def main():
        async for message in query(
            prompt="Plan how to add dark mode support to a React app.",
            options=CortexCodeAgentOptions(
                cwd=".",
                output_format={
                    "type": "json_schema",
                    "schema": FeaturePlan.model_json_schema(),
                },
            ),
        ):
            if isinstance(message, ResultMessage) and message.structured_output:
                plan = FeaturePlan.model_validate(message.structured_output)
                print(f"Feature: {plan.feature_name}")
                for step in plan.steps:
                    print(f"{step.step_number}. [{step.estimated_complexity}] {step.description}")

    asyncio.run(main())
This example shows structured outputs with multi-step tool use. The agent finds TODO comments in a codebase using
built-in tools (Grep, Bash), then returns the results as structured data. Optional fields like author handle cases
where git blame information may not be available.
TypeScript:

    import { query } from "cortex-code-agent-sdk";

    const todoSchema = {
      type: "object",
      properties: {
        todos: {
          type: "array",
          items: {
            type: "object",
            properties: {
              text: { type: "string" },
              file: { type: "string" },
              line: { type: "number" },
              author: { type: "string" },
              date: { type: "string" },
            },
            required: ["text", "file", "line"],
          },
        },
        total_count: { type: "number" },
      },
      required: ["todos", "total_count"],
    };

    for await (const message of query({
      prompt: "Find all TODO comments in this codebase and identify who added them",
      options: {
        cwd: process.cwd(),
        outputFormat: { type: "json_schema", schema: todoSchema },
      },
    })) {
      if (message.type === "result" && message.structured_output) {
        const data = message.structured_output;
        console.log(`Found ${data.total_count} TODOs`);
        data.todos.forEach((todo) => {
          console.log(`${todo.file}:${todo.line} - ${todo.text}`);
          if (todo.author) {
            // date is optional too, so fall back when it is missing
            console.log(` Added by ${todo.author} on ${todo.date ?? "unknown date"}`);
          }
        });
      }
    }

Python:

    import asyncio
    from cortex_code_agent_sdk import query, CortexCodeAgentOptions, ResultMessage

    todo_schema = {
        "type": "object",
        "properties": {
            "todos": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "text": {"type": "string"},
                        "file": {"type": "string"},
                        "line": {"type": "number"},
                        "author": {"type": "string"},
                        "date": {"type": "string"},
                    },
                    "required": ["text", "file", "line"],
                },
            },
            "total_count": {"type": "number"},
        },
        "required": ["todos", "total_count"],
    }

    async def main():
        async for message in query(
            prompt="Find all TODO comments in this codebase and identify who added them",
            options=CortexCodeAgentOptions(
                cwd=".",
                output_format={"type": "json_schema", "schema": todo_schema},
            ),
        ):
            if isinstance(message, ResultMessage) and message.structured_output:
                data = message.structured_output
                print(f"Found {data['total_count']} TODOs")
                for todo in data["todos"]:
                    print(f"{todo['file']}:{todo['line']} - {todo['text']}")
                    if "author" in todo:
                        # date is optional too, so avoid a KeyError when it is missing
                        print(f" Added by {todo['author']} on {todo.get('date', 'unknown date')}")

    asyncio.run(main())
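Because author and date are optional in the TODO schema, consuming code has to tolerate their absence. The sketch below converts a result of that shape into dataclasses using only the standard library. `Todo`, `parse_todos`, and the sample payload are hypothetical and not part of the SDK.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Todo:
    text: str
    file: str
    line: int
    author: Optional[str] = None  # absent when git blame is unavailable
    date: Optional[str] = None


def parse_todos(data: dict[str, Any]) -> list[Todo]:
    """Convert a dict shaped like the TODO schema into Todo objects."""
    return [
        Todo(
            text=item["text"],
            file=item["file"],
            line=int(item["line"]),
            author=item.get("author"),  # .get() returns None for missing keys
            date=item.get("date"),
        )
        for item in data["todos"]
    ]


# Hypothetical payload in the shape of the TODO schema.
sample = {
    "todos": [
        {
            "text": "TODO: handle timeouts",
            "file": "client.py",
            "line": 12,
            "author": "alex",
            "date": "2024-05-01",
        },
        {"text": "TODO: add tests", "file": "utils.py", "line": 40},
    ],
    "total_count": 2,
}

for todo in parse_todos(sample):
    blame = f" (added by {todo.author})" if todo.author else ""
    print(f"{todo.file}:{todo.line} - {todo.text}{blame}")
```

Defaulting the optional attributes to None keeps the required fields strict while letting downstream code branch on a single truthiness check.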
Cortex Code has built-in Snowflake SQL tools. You can combine them with structured output to get typed query results:
TypeScript:

    import { query } from "cortex-code-agent-sdk";

    const schema = {
      type: "object",
      properties: {
        top_customers: {
          type: "array",
          items: {
            type: "object",
            properties: {
              name: { type: "string" },
              total_revenue: { type: "number" },
              order_count: { type: "number" },
            },
            required: ["name", "total_revenue", "order_count"],
          },
        },
        query_used: { type: "string" },
      },
      required: ["top_customers", "query_used"],
    };

    for await (const message of query({
      prompt: "Find the top 5 customers by revenue from the ORDERS table",
      options: {
        cwd: process.cwd(),
        connection: "my-connection",
        outputFormat: { type: "json_schema", schema },
      },
    })) {
      if (message.type === "result" && message.structured_output) {
        const { top_customers, query_used } = message.structured_output;
        console.log(`Query: ${query_used}`);
        top_customers.forEach((c) => {
          console.log(`${c.name}: $${c.total_revenue} (${c.order_count} orders)`);
        });
      }
    }

Python:

    import asyncio
    from cortex_code_agent_sdk import query, CortexCodeAgentOptions, ResultMessage

    schema = {
        "type": "object",
        "properties": {
            "top_customers": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "total_revenue": {"type": "number"},
                        "order_count": {"type": "number"},
                    },
                    "required": ["name", "total_revenue", "order_count"],
                },
            },
            "query_used": {"type": "string"},
        },
        "required": ["top_customers", "query_used"],
    }

    async def main():
        async for message in query(
            prompt="Find the top 5 customers by revenue from the ORDERS table",
            options=CortexCodeAgentOptions(
                cwd=".",
                connection="my-connection",
                output_format={"type": "json_schema", "schema": schema},
            ),
        ):
            if isinstance(message, ResultMessage) and message.structured_output:
                data = message.structured_output
                print(f"Query: {data['query_used']}")
                for c in data["top_customers"]:
                    print(f"{c['name']}: ${c['total_revenue']} ({c['order_count']} orders)")

    asyncio.run(main())
The outputFormat (TypeScript) or output_format (Python) option accepts an object with the following fields:
Field  | Value              | Description
-------|--------------------|------------------------------------------
type   | "json_schema"      | Required. Only json_schema is supported.
schema | JSON Schema object | Defines the output structure. Generate from Zod with z.toJSONSchema() or Pydantic with .model_json_schema().
Standard JSON Schema features are supported: all basic types (object, array, string, number,
boolean, null), enum, const, required, nested objects, and $ref definitions.
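As an illustration of how these features combine, here is a sketch of a single schema that uses enum, const, a nullable field, a nested array, and a $ref definition. The ticket fields are hypothetical. Placing shared definitions under "$defs" and writing a nullable type as a list follow recent JSON Schema drafts and are assumptions about what the SDK accepts. `resolve_local_ref` is a small helper written for this example to show what a local reference points at.

```python
import json

# Hypothetical schema for illustration; none of these field names come from the SDK.
ticket_schema = {
    "type": "object",
    "properties": {
        "kind": {"const": "ticket"},  # must be exactly this value
        "priority": {"type": "string", "enum": ["low", "medium", "high"]},
        "assignee": {"type": ["string", "null"]},  # nullable string
        "reporter": {"$ref": "#/$defs/person"},  # reuses the shared definition
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["kind", "priority", "reporter"],
    "$defs": {
        "person": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "active": {"type": "boolean"},
            },
            "required": ["name"],
        }
    },
}


def resolve_local_ref(schema: dict, ref: str) -> dict:
    """Follow a local reference such as '#/$defs/person' within the schema."""
    node = schema
    for part in ref.removeprefix("#/").split("/"):
        node = node[part]
    return node


# Show the definition that the "reporter" property refers to.
target = ticket_schema["properties"]["reporter"]["$ref"]
print(json.dumps(resolve_local_ref(ticket_schema, target), indent=2))
```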
Structured output generation can fail when the agent cannot produce valid JSON matching your schema. When this happens,
the result message has a subtype indicating what went wrong:
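Subtype                             | Meaning
------------------------------------|-------------------------------------------------------------
success                             | Output was generated and validated successfully
error_max_structured_output_retries | Agent could not produce valid output after multiple attempts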
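TypeScript:

    import { query } from "cortex-code-agent-sdk";

    // contactSchema is a JSON Schema object assumed to be defined elsewhere.
    for await (const msg of query({
      prompt: "Extract contact info from the document",
      options: {
        cwd: process.cwd(),
        outputFormat: { type: "json_schema", schema: contactSchema },
      },
    })) {
      if (msg.type === "result") {
        if (msg.subtype === "success" && msg.structured_output) {
          console.log(msg.structured_output);
        } else if (msg.subtype === "error_max_structured_output_retries") {
          console.error("Could not produce valid output");
        }
      }
    }

Python:

    from cortex_code_agent_sdk import query, CortexCodeAgentOptions, ResultMessage

    # Run this loop inside an async function.
    # contact_schema is a JSON Schema dict assumed to be defined elsewhere.
    async for message in query(
        prompt="Extract contact info from the document",
        options=CortexCodeAgentOptions(
            cwd=".",
            output_format={"type": "json_schema", "schema": contact_schema},
        ),
    ):
        if isinstance(message, ResultMessage):
            if message.subtype == "success" and message.structured_output:
                print(message.structured_output)
            elif message.subtype == "error_max_structured_output_retries":
                print("Could not produce valid output")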
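One way to keep this branching out of the main loop is a small function that turns a result into either data or an exception. The sketch below assumes only the two subtypes listed above. `StructuredOutputError` and `unwrap_result` are hypothetical names, and `FakeResult` is a stand-in for the SDK's ResultMessage so the snippet runs on its own.

```python
from dataclasses import dataclass
from typing import Any, Optional


class StructuredOutputError(Exception):
    """Raised when a result carries no validated structured output."""


@dataclass
class FakeResult:
    # Stand-in for the SDK's ResultMessage; only the two attributes used here.
    subtype: str
    structured_output: Optional[Any] = None


def unwrap_result(result: FakeResult) -> Any:
    """Return the structured output, or raise if the result has none."""
    if result.subtype == "success" and result.structured_output is not None:
        return result.structured_output
    if result.subtype == "error_max_structured_output_retries":
        raise StructuredOutputError("agent could not satisfy the schema after retries")
    raise StructuredOutputError(f"no structured output in result (subtype: {result.subtype})")


# Hypothetical payload for the success path.
print(unwrap_result(FakeResult("success", {"email": "a@example.com"})))

try:
    unwrap_result(FakeResult("error_max_structured_output_retries"))
except StructuredOutputError as exc:
    # A caller might retry with a simpler schema or fall back to free-form text here.
    print(f"fallback path: {exc}")
```

Raising an exception makes the failure hard to ignore, while the caller stays free to decide whether to retry, relax the schema, or surface the error.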
Tip: Tips for avoiding errors:
- Keep schemas focused. Deeply nested schemas with many required fields are harder to satisfy. Start simple and add complexity as needed.
- Match schema to task. If the task might not have all the information your schema requires, make those fields optional.
- Use clear prompts. Ambiguous prompts make it harder for the agent to know what output to produce.
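In JSON Schema, a property is optional when it is not listed in `required`, so making a field optional means removing its name from that list. The following sketch shows one way to relax an existing schema without editing it by hand. `make_optional` and the contact fields are hypothetical and not part of the SDK.

```python
import copy
from typing import Any


def make_optional(schema: dict[str, Any], fields: set[str]) -> dict[str, Any]:
    """Return a copy of an object schema with the given fields no longer required."""
    relaxed = copy.deepcopy(schema)
    relaxed["required"] = [
        name for name in schema.get("required", []) if name not in fields
    ]
    return relaxed


# Hypothetical contact schema in which every field starts out required.
strict = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "email": {"type": "string"},
        "phone": {"type": "string"},
    },
    "required": ["name", "email", "phone"],
}

relaxed = make_optional(strict, {"phone"})
print(relaxed["required"])  # ['name', 'email']
print(strict["required"])   # ['name', 'email', 'phone'] (original is untouched)
```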
Where your configuration of Cortex Code uses a model provided on the
Model and Service Pass-Through Terms,
your use of that model is further subject to the terms for that model on that page.
The data classification of inputs and outputs is as set forth in the following table.