by eugeneonai 13 days ago
Looks nice. The README pitches per-task model selection but says little about context management. In coding-agent loops, the full system prompt and tool specs are re-sent on every step, so a 30-step task pays that input cost 30 times. Prompt-cache headers catch the static prefix, but the per-step additions (file diffs, observation tokens) aren't cached, and they're often most of the input. Auto-summarizing the older part of the trajectory into a compact state summary saved 40-60% of input tokens in workloads I've looked at. That could be a useful daemon-side concern, since users won't reach into each agent's internals.
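To make the summarization idea concrete, here's a rough Python sketch of what I mean, not anything from the project. All names (`Trajectory`, `toy_summarize`, `rough_tokens`, `demo`) are made up for illustration, the summarizer is a stand-in that just keeps the first few words of each old step, and token counts are a crude whitespace estimate. The static prefix stays first and unchanged, so prompt caching can still apply to it.

```python
from dataclasses import dataclass, field
from typing import Callable, List


def rough_tokens(text: str) -> int:
    """Crude whitespace token estimate; a real tokenizer will give different counts."""
    return len(text.split())


def toy_summarize(prior: str, old_steps: List[str]) -> str:
    """Stand-in summarizer (assumption): keeps the first 4 words of each old step.
    In practice this would be a model call with a bounded output length."""
    heads = [" ".join(s.split()[:4]) for s in old_steps]
    return "; ".join(([prior] if prior else []) + heads)


@dataclass
class Trajectory:
    static_prefix: str  # system prompt + tool specs: stable, so cacheable
    summarize: Callable[[str, List[str]], str] = toy_summarize
    keep_recent: int = 3  # number of newest steps kept verbatim
    summary: str = ""
    steps: List[str] = field(default_factory=list)

    def add_step(self, observation: str) -> None:
        self.steps.append(observation)
        cut = len(self.steps) - self.keep_recent
        if cut > 0:
            # Fold everything older than the recent window into the summary.
            self.summary = self.summarize(self.summary, self.steps[:cut])
            self.steps = self.steps[cut:]

    def build_prompt(self) -> str:
        # Order: static prefix, then the rolling summary, then recent steps.
        parts = [self.static_prefix]
        if self.summary:
            parts.append("Earlier steps (summarized): " + self.summary)
        parts.extend(self.steps)
        return "\n".join(parts)


def demo(n_steps: int = 30):
    """Return (tokens if every step is re-sent verbatim, tokens with summarization)."""
    traj = Trajectory(static_prefix="SYSTEM PROMPT AND TOOL SPECS")
    full = [traj.static_prefix]
    for i in range(n_steps):
        obs = f"step {i}: " + "diff line " * 50
        traj.add_step(obs)
        full.append(obs)
    return rough_tokens("\n".join(full)), rough_tokens(traj.build_prompt())
```

Calling `demo()` compares the final-step prompt size with and without folding. The gap on this toy input says nothing about real workloads, where the savings depend on how lossy a summary the agent can tolerate.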