Agent State & Memory
Persisting decisions and intermediate results across turns.
Agents that span turns need durable state: which tools have been called, what was learned, what the user has approved. Persist a minimal state vector (JSON) rather than serializing the full conversation. Reconstruct context from the state vector on each turn.
Real-world example: Multi-day SaaS support cases
A B2B SaaS support agent works cases that span days and multiple replies. Resending the full thread on every turn is wasteful — 40K tokens by Wednesday.
Instead, persist a tight state vector:
{
"case_id": "C-1042",
"customer": "Acme Co.",
"summary": "Webhook receiving duplicate events since 2026-05-08.",
"tried": ["replay tool", "dedup index check"],
"blocked_on": "customer to share their consumer log",
"next_action": "wait for log; then inspect signature mismatch theory"
}Each new turn loads this vector + the latest user message — usually under 2K tokens. The agent stays grounded across days without context bloat.
Why this matters: state is what the agent *learned*, not what it *said*. Persist the former, regenerate the latter.
- Persist a state vector, not full history
- Rebuild context per turn from the vector
- Track tool calls and learned facts
- Track approvals separately from prose
- Minimize what flows between turns
