Agent Chat
Agent Chat lets you talk directly to a single agent persona — no debate, no synthesis, just the agent responding in full character using its system prompt, personality, mission, and persistent memory.
Use it to embed advisor bots in your product, build persona-driven chatbots, or run automated advisory flows for your users.
Sessions
A session groups multiple chat turns into one conversation thread. The agent receives the full message history on each turn, giving it context across the entire session. Sessions are optional — if you omit sessionId on a chat call, a new session is created automatically and its ID is returned so you can continue the thread.
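A minimal Python sketch of carrying a session across turns. It assumes a hypothetical `send` callable that posts a request body to the chat endpoint and returns the decoded JSON response; the `message` field name and the `ChatThread` class are illustrative assumptions, while `sessionId` is the field described above.

```python
from typing import Callable, Optional


class ChatThread:
    """Tracks the sessionId so successive turns land in the same session."""

    def __init__(self, send: Callable[[dict], dict]):
        # `send` is a hypothetical transport: request body in, response JSON out.
        self._send = send
        self.session_id: Optional[str] = None

    def say(self, message: str) -> dict:
        body = {"message": message}  # "message" is an assumed field name
        if self.session_id is not None:
            body["sessionId"] = self.session_id  # continue the existing thread
        response = self._send(body)
        # On the first turn the server creates a session and returns its ID.
        self.session_id = response.get("sessionId", self.session_id)
        return response
```

The first call to `say` omits `sessionId`, so a new session is created; every later call reuses the ID returned by that first response.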
Sending a Message
The core endpoint. Send a message, get the agent's response. The agent's full system prompt, persona fields (role, personality, mission, decision style), and all its accumulated memories are injected on every call — no setup required beyond passing a message.
Minimal call — just a message, no extras:
Full call with context, extraction, and reflection:
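As a hedged illustration, the two request bodies could be assembled in Python like this. Only the JSON body is shown because the endpoint path is not given in this section. The `message` field name and the `build_chat_body` helper are assumptions; `sessionId`, `context`, `extraction`, and `options.reflection` follow the sections of this page.

```python
import json


def build_chat_body(message, session_id=None, context=None,
                    extraction=None, reflection=False):
    """Assemble a chat request body, omitting anything left unset."""
    body = {"message": message}  # assumed field name
    if session_id:
        body["sessionId"] = session_id
    if context:
        body["context"] = context
    if extraction:
        body["extraction"] = extraction
    if reflection:
        body["options"] = {"reflection": True}
    return body


# Minimal call: just a message.
minimal = build_chat_body("Should we hire a VP of Sales this quarter?")

# Full call: context, extraction, and reflection together.
full = build_chat_body(
    "Should we hire a VP of Sales this quarter?",
    context=[{
        "category": "financials",
        "data": {"arr": 850000, "burn": 42000},
        "priority": "high",
    }],
    extraction={"categories": ["company_info", "hiring_decisions"],
                "persist": True},
    reflection=True,
)

print(json.dumps(full, indent=2))
```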
Context Injection
Pass structured data alongside your message using the context array. Each item becomes a labelled block appended to the agent's system prompt before the LLM call — the agent sees and reasons over the data as if it were part of its briefing.
Use priority: "high" to push critical data (financials, live metrics) to the top of the context block. Lower-priority items appear below it, so the agent reads high-priority context first.
Context shape:
{
  "category": "financials", // label shown to the agent
  "data": { "arr": 850000, "burn": 42000 }, // object or string
  "label": "Company Financials", // optional override for the heading
  "priority": "high" // "high" | "medium" | "low"
}
Context is ephemeral: it applies only to the current call. If you need the agent to remember data across sessions, let it save a <MEMORY> tag (see below) or pre-seed memories via POST /v1/agents/:id/memory.
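A small Python sketch for building context items in the shape shown above. The helper names (`context_item`, `preview_order`) are hypothetical, and the `"medium"` default priority is an assumption, since this page does not state a default. `preview_order` only mirrors the ordering described above for local inspection; the actual ordering happens on the server.

```python
PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def context_item(category, data, label=None, priority="medium"):
    """Build one context entry; the "medium" default is an assumption."""
    if priority not in PRIORITY_ORDER:
        raise ValueError(f"priority must be one of {sorted(PRIORITY_ORDER)}")
    if not isinstance(data, (dict, str)):
        raise TypeError("data must be an object (dict) or a string")
    item = {"category": category, "data": data, "priority": priority}
    if label is not None:
        item["label"] = label  # optional heading override
    return item


def preview_order(items):
    """Return items in the order the agent is described as reading them."""
    # sorted() is stable, so items with equal priority keep their input order.
    return sorted(items, key=lambda item: PRIORITY_ORDER[item["priority"]])


context = [
    context_item("notes", "Board meeting on Friday", priority="low"),
    context_item("financials", {"arr": 850000, "burn": 42000},
                 label="Company Financials", priority="high"),
]
```

Passing `context` as the `context` array of a chat call sends both items; `preview_order(context)` lists the financials item first.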
Memory System
Agents accumulate memories automatically. When the LLM determines something is worth remembering — a user preference, a key decision, a company fact — it emits a <MEMORY> tag in its response. These tags are stripped from the output the client receives and persisted to the agent's memory store.
<!-- emitted by the LLM, stripped before the response reaches your app -->
<MEMORY key="user_risk_appetite" type="preference">
Conservative: prefers validated growth channels over experimental spend
</MEMORY>
<MEMORY key="company_arr" type="fact" global="true">
Current ARR: $850K as of Q2 2026
</MEMORY>
Memories are agent-scoped, not session-scoped. An agent that remembers something in session A will recall it in session B, C, and beyond. Global memories (tagged global="true") are shared across all agents in the workspace.
Inspect, add, or delete an agent's memories using the Agent Memory endpoints on the Agents page.
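To make the tag handling concrete, here is a rough Python sketch of how `<MEMORY>` tags could be separated from the visible response. It is an illustration only, not the platform's implementation: the stripping happens server-side, and the `split_memories` function and its output keys are hypothetical.

```python
import re

# Matches one <MEMORY ...>...</MEMORY> tag, capturing attributes and body.
MEMORY_TAG = re.compile(r"<MEMORY\b([^>]*)>(.*?)</MEMORY>", re.DOTALL)
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')


def split_memories(raw):
    """Return (clean_text, memories) for a raw LLM response."""
    memories = []
    for attrs, body in MEMORY_TAG.findall(raw):
        fields = dict(ATTRIBUTE.findall(attrs))
        memories.append({
            "key": fields.get("key"),
            "type": fields.get("type"),
            "global": fields.get("global") == "true",
            "value": " ".join(body.split()),  # collapse whitespace
        })
    clean = MEMORY_TAG.sub("", raw).strip()
    return clean, memories
```

Given a raw response containing the two tags from the example above, `split_memories` would return the prose without the tags plus two memory records, the second one flagged as global.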
Fact Extraction
When you include the extraction field, a lightweight secondary LLM call extracts structured facts from the user message after the main response. Extracted facts are tagged by category and stored in ExtractedData.
"extraction": {
  "categories": ["company_info", "hiring_decisions", "preferences"],
  "persist": true // write to ExtractedData (default)
}
Extracted facts are also returned inline in the extracted array of each response:
"extracted": [
  { "category": "company_info", "key": "arr", "value": "850000", "confidence": 0.97 },
  { "category": "hiring_decisions", "key": "vp_sales", "value": "considering for Q3 2026", "confidence": 0.88 }
]
Query persisted facts at any time:
Reflection
Set options.reflection: true to enable two-pass thinking. Before writing its final answer, the agent emits a <THINK> block where it reasons through the problem — exploring angles, challenging assumptions, naming risks. That block is stripped from content and returned separately in the thoughts field.
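The split between the reasoning block and the final answer can be pictured with a short Python sketch. This is an assumption-laden illustration of the behaviour described above, not the service's actual code; `split_reflection` is a hypothetical name, and the split is performed for you before the response is returned.

```python
import re

THINK_BLOCK = re.compile(r"<THINK>(.*?)</THINK>", re.DOTALL)


def split_reflection(raw):
    """Separate the reasoning trace from the final answer."""
    match = THINK_BLOCK.search(raw)
    thoughts = match.group(1).strip() if match else None
    content = THINK_BLOCK.sub("", raw).strip()
    return {"content": content, "thoughts": thoughts}
```

When no `<THINK>` block is present, the sketch returns `thoughts` as `None` and leaves the content unchanged.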
thoughts contains the agent's complete reasoning trace. Use it to display a "show reasoning" toggle in your UI, log it for auditability, or feed it into downstream analysis. The content field always contains only the clean final response.
Example response with reflection enabled:
{
  "content": "With 18 months of runway, the math is borderline. MY RECOMMENDATION: delay 90 days.",
  "thoughts": "The user's burn is $42K/month. A VP hire adds ~$18K/month in cost, shrinking runway from 18 to ~15 months. ARR is $850K — not enough to de-risk it yet. I should push back on the timing and frame the conditions for a green light...",
  "sessionId": "conv_yyy",
  ...
}
Response Format
By default agents respond with prose. Add responseFormat to make the agent return structured JSON instead — useful when you need to feed the output directly into your application logic without parsing free text.
When responseFormat.type is "json", the memory XML tag system is automatically suspended (XML and JSON mode are incompatible). Memories already loaded into the agent are still injected as context; only the ability to emit new memories mid-response is paused for that call.
Request shape:
"responseFormat": {
  "type": "json",
  "fields": {
    "recommendation": "clear yes/no recommendation",
    "reasoning": "2-3 sentence rationale",
    "conditions": "bullet list of conditions that would change the answer",
    "confidence": "0 to 1 confidence score"
  }
}
The agent will populate every field you define. Descriptions guide what the agent writes, so be specific. If you prefer a free-form schema description instead of named fields, use the schema string property.
Example response with responseFormat.type: "json":
{
  "content": "{\"recommendation\":\"No\",\"reasoning\":\"With 18 months runway and $850K ARR, a VP hire adds ~$18K/month burn...\",\"conditions\":[\"ARR above $1.2M\",\"Runway above 20 months\"],\"confidence\":0.82}",
  "contentParsed": {
    "recommendation": "No",
    "reasoning": "With 18 months runway and $850K ARR, a VP hire adds ~$18K/month burn and risks eating into the runway before the hire produces pipeline.",
    "conditions": ["ARR above $1.2M", "Runway above 20 months"],
    "confidence": 0.82
  },
  "sessionId": "conv_yyy",
  "memoriesSaved": [],
  "extracted": [],
  "tokenUsage": { "promptTokens": 540, "completionTokens": 120, "totalTokens": 660 }
}
Combining JSON format with reflection. When both responseFormat.type: "json" and options.reflection: true are set, the agent writes its reasoning into a thoughts field inside the JSON object instead of a separate <THINK> block. That field is then removed from contentParsed and returned in the top-level thoughts field as usual.
// reflection: true + responseFormat.type: "json"
{
  "contentParsed": {
    "recommendation": "No",
    "reasoning": "...",
    "confidence": 0.82
    // "thoughts" field is NOT here: it was extracted
  },
  "thoughts": "The user's burn is $42K/month. A VP hire adds ~$18K/month..."
}
Full example with structured JSON response:
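A minimal client-side sketch in Python, assuming a `message` field on the request (the field name is not shown in this section) and a response shaped like the examples above. `build_body` and `read_structured` are hypothetical helper names, and the fallback to decoding `content` is a defensive assumption rather than documented behaviour.

```python
import json

RESPONSE_FORMAT = {
    "type": "json",
    "fields": {
        "recommendation": "clear yes/no recommendation",
        "reasoning": "2-3 sentence rationale",
        "confidence": "0 to 1 confidence score",
    },
}


def build_body(message):
    # "message" is an assumed field name; responseFormat follows the shape above.
    return {"message": message, "responseFormat": RESPONSE_FORMAT}


def read_structured(response):
    """Return the parsed object, checking every requested field is present."""
    parsed = response.get("contentParsed")
    if parsed is None:
        # Fall back to decoding the raw JSON string in `content`.
        parsed = json.loads(response["content"])
    missing = [name for name in RESPONSE_FORMAT["fields"] if name not in parsed]
    if missing:
        raise ValueError(f"response is missing fields: {missing}")
    return parsed
```

Checking for missing fields before use keeps application logic from silently acting on a partial answer.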