> Every decision is traceable to one context window
No model can do all of the steps mentioned in a single usable context window. This is why subagents and multi-agent orchestrators exist in the first place.
You're right that no model handles everything in one context window — that's exactly why I built context rotation. Each task runs in a single agent context (one responsibility, clear scope), and when the window fills up, the system automatically rotates: writes a structured handover, clears, and resumes in a fresh window.
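To make the rotation step concrete, here is a minimal sketch of the idea, not the actual implementation. `AgentContext`, `rotate_if_full`, the 80% threshold, the word-count token estimate, and the `handover.json` layout are all assumptions made up for illustration.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class AgentContext:
    """Hypothetical stand-in for one agent's context window."""
    task: str
    max_tokens: int
    messages: list[str] = field(default_factory=list)

    def used_tokens(self) -> int:
        # Crude assumption: one token per whitespace-separated word.
        return sum(len(m.split()) for m in self.messages)


def rotate_if_full(ctx: AgentContext, handover_dir: Path,
                   threshold: float = 0.8) -> AgentContext:
    """Return ctx unchanged, or a fresh context seeded from a written handover."""
    if ctx.used_tokens() < threshold * ctx.max_tokens:
        return ctx  # still room in the window, keep going

    # 1. Write a structured handover (the field names are assumptions).
    handover = {
        "task": ctx.task,
        "tokens_used": ctx.used_tokens(),
        "recent_messages": ctx.messages[-3:],
    }
    handover_dir.mkdir(parents=True, exist_ok=True)
    path = handover_dir / "handover.json"
    path.write_text(json.dumps(handover, indent=2), encoding="utf-8")

    # 2. Clear: start a new, empty context for the same task.
    fresh = AgentContext(task=ctx.task, max_tokens=ctx.max_tokens)

    # 3. Resume: the only thing carried over is a pointer to the handover.
    fresh.messages.append(f"Resume from handover: {path}")
    return fresh
```

The point of the design is that nothing survives the rotation except what was written to the handover file, so the handover is also the audit trail for that window.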
The key distinction: sub-agents run inside a parent context with shared state, which makes them a black box from the outside. My approach uses independent parallel agents (separate terminals, separate context windows) that report back to an orchestrator. Large tasks get split into smaller dispatches upfront, each scoped to fit a single context window. For example, the orchestrator can dispatch research to 3 agents in parallel, collect their outputs, then dispatch a synthesis task to a single agent that merges the findings.
So it's not "one context window for everything" — it's right-sized tasks with full observability per agent, and a governance layer managing the sequence and merging results.
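The research-then-synthesis flow is a plain fan-out/fan-in. A rough sketch, assuming a hypothetical `run_agent` function that stands in for handing a dispatch to an agent in its own terminal (here it just returns a canned report, and a thread pool stands in for the parallel terminals):

```python
from concurrent.futures import ThreadPoolExecutor


def run_agent(terminal: str, dispatch: str) -> dict:
    """Hypothetical stand-in: a real agent would run in its own terminal/session."""
    return {
        "terminal": terminal,
        "dispatch": dispatch,
        "findings": f"notes on {dispatch}",
    }


def research_then_synthesize(topics: list[str]) -> dict:
    terminals = ["T1", "T2", "T3"]

    # Fan out: one research dispatch per worker terminal, run in parallel.
    with ThreadPoolExecutor(max_workers=len(terminals)) as pool:
        reports = list(pool.map(run_agent, terminals, topics))

    # Fan in: merge the findings into one synthesis dispatch for a single agent.
    merged = "; ".join(r["findings"] for r in reports)
    return run_agent("T1", f"synthesize: {merged}")


result = research_then_synthesize(["auth flow", "db schema", "rate limits"])
print(result["dispatch"])
```

Each research dispatch is small enough to fit one window, and the synthesis agent only ever sees the three reports, not the three full research contexts.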
The terminal layout is four panes:
T0 (orchestrator) | T1 (Track A)
T2 (Track B) | T3 (Track C)
When a worker finishes, it writes a structured report to a shared unified_reports/ directory. A file watcher (receipt processor) detects it, parses the report into a structured NDJSON receipt (status, files changed, open items, git ref), and delivers it to T0's pane.
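As an illustration of the receipt step, here is a polling sketch using only the standard library. The report format (`Key: value` lines in `.md` files), the receipt field names, and the function names are assumptions for the example; delivery to T0's pane is left out.

```python
import json
from pathlib import Path


def parse_report(text: str) -> dict:
    """Turn 'Key: value' lines into a receipt dict (the report format is assumed)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip().lower()] = value.strip()
    return {
        "status": fields.get("status", "unknown"),
        "files_changed": [f.strip() for f in fields.get("files", "").split(",") if f.strip()],
        "open_items": [i.strip() for i in fields.get("open items", "").split(";") if i.strip()],
        "git_ref": fields.get("git ref", ""),
    }


def process_new_reports(report_dir: Path, ledger: Path, seen: set[str]) -> list[dict]:
    """Scan report_dir once; append one NDJSON receipt per report not seen before."""
    receipts = []
    for path in sorted(report_dir.glob("*.md")):
        if path.name in seen:
            continue
        receipt = parse_report(path.read_text(encoding="utf-8"))
        receipt["report"] = path.name
        with ledger.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(receipt) + "\n")  # one JSON object per line
        seen.add(path.name)
        receipts.append(receipt)
    return receipts
```

Calling `process_new_reports` in a loop with a short sleep gives a basic watcher; an event-based file watcher would replace the loop but not the parsing.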
T0 then reviews the receipt, runs a quality advisory (automated pass/warn/hold verdict), and decides: close open items, complete the PR, or redispatch.
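A toy version of the advisory could look like the following. The rules and the suggested follow-up actions are invented for illustration (the real checks are not described here), and the receipt fields `status`, `open_items`, and `git_ref` are assumed names.

```python
def quality_advisory(receipt: dict) -> str:
    """Return 'pass', 'warn', or 'hold' for a receipt (illustrative rules only)."""
    # Assumed rule: unfinished work or a missing commit blocks everything.
    if receipt.get("status") != "done" or not receipt.get("git_ref"):
        return "hold"
    # Assumed rule: finished work with leftovers is acceptable but flagged.
    if receipt.get("open_items"):
        return "warn"
    return "pass"


# Hypothetical mapping from verdict to a suggested next step.
# It is only advisory: T0 still makes the final call.
SUGGESTED_ACTION = {
    "pass": "complete the PR",
    "warn": "close open items",
    "hold": "redispatch",
}

receipt = {"status": "done", "open_items": ["update docs"], "git_ref": "abc123"}
verdict = quality_advisory(receipt)
print(verdict, "->", SUGGESTED_ACTION[verdict])
```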
Everything is filesystem-based — no API, no database, no shared memory between agents. Each terminal has its own context window, its own Claude Code (or Codex/Gemini) session, and the only communication channel is structured files on disk.
The receipt ledger is append-only NDJSON, so you can always trace: which agent did what, when, on which dispatch, with which git commit.
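Tracing a dispatch is then just a filter over the ledger. A small sketch, assuming each receipt line carries `agent`, `dispatch_id`, `timestamp`, and `git_ref` fields (those names are my placeholders, not the real schema):

```python
import json
from pathlib import Path


def append_receipt(ledger: Path, receipt: dict) -> None:
    """Append one receipt as a single JSON line; existing lines are never rewritten."""
    with ledger.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(receipt) + "\n")


def trace_dispatch(ledger: Path, dispatch_id: str) -> list[dict]:
    """Return every receipt for one dispatch, in the order it was written."""
    with ledger.open(encoding="utf-8") as fh:
        receipts = [json.loads(line) for line in fh if line.strip()]
    return [r for r in receipts if r.get("dispatch_id") == dispatch_id]
```

Because the file is append-only, the line order is the event order, so no separate index is needed to reconstruct what happened on a dispatch.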
I open-sourced the setup recently if you want to dig into the details.