|
|
|
Show HN: LLM agents that write Python to analyze execution traces at scale
(github.com)
|
|
5 points
by kayba
104 days ago
|
|
We combined Stanford's ACE (agents learning from execution feedback) with the Reflective Language Model pattern. Instead of reading traces in a single pass, an LLM writes and runs Python in a sandbox to programmatically explore them - finding cross-trace patterns that single-pass analysis misses. The framework achieved 2x consistency improvement on τ2-bench. |
|