Hacker News new | ask | show | jobs
by webpolis 101 days ago
Loop detection is the harder problem and almost nobody solves it well at the infra layer. The pattern you're describing — write, fail, "fix," regress — is fundamentally a semantic loop, not a mechanical one. You can't catch it by diffing outputs or counting retries because each iteration looks superficially different. The agent thinks it's making progress.

What actually works in my experience: budget the agent a rolling token window per subtask, not per session. If it burns 40% of its context on a single function without a passing test, force an early return with the error context and let the orchestrator decide whether to retry with a different prompt or bail. Putting that logic in the VM host is tempting but wrong — the host doesn't have the semantic context to know "stuck" from "legitimately hard." That belongs in the agent framework or a thin supervisor shim between the framework and the execution environment.