by enraged_camel 10 days ago
Spawning separate agents to review the original agent's implementation results in a very noticeable increase in code quality and decrease in bugs. This is why I build two or three rounds of sub-agent review into the planning process: I tell the agent authoring the plan to include those review rounds at the end. If the code is particularly load-bearing, I then ask a fourth agent, usually from the other frontier lab, to review it as well.

All of this burns more tokens of course, but probably way less than coming back to the code later to fix bugs. It is also slower, but in the long run saves time.
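
For anyone who wants to picture the review rounds described above, here is a rough Python sketch. Everything in it is an assumption made for illustration: `Reviewer`, `Fixer`, `run_review_rounds`, and the stub helpers are hypothetical names, and the stubs stand in for real agent calls, which are not shown here.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical types: a reviewer takes code and returns a list of findings;
# a fixer takes code plus findings and returns revised code.
Reviewer = Callable[[str], list[str]]
Fixer = Callable[[str, list[str]], str]


@dataclass
class ReviewLog:
    # One entry per review round, holding that round's findings.
    rounds: list[list[str]] = field(default_factory=list)


def run_review_rounds(
    code: str, reviewers: list[Reviewer], fix: Fixer
) -> tuple[str, ReviewLog]:
    """Run each reviewer in order, applying fixes before the next round."""
    log = ReviewLog()
    for review in reviewers:
        findings = review(code)
        log.rounds.append(findings)
        if findings:
            code = fix(code, findings)
    return code, log


# Stub stand-ins for real agents (assumption: a real setup would call a model).
def stub_reviewer(marker: str) -> Reviewer:
    def review(code: str) -> list[str]:
        # Report the marker as a finding if it still appears in the code.
        return [marker] if marker in code else []
    return review


def stub_fix(code: str, findings: list[str]) -> str:
    # Pretend to fix each finding by deleting the offending marker.
    for finding in findings:
        code = code.replace(finding, "")
    return code


if __name__ == "__main__":
    reviewers = [stub_reviewer("BUG1"), stub_reviewer("BUG2"), stub_reviewer("BUG3")]
    final_code, log = run_review_rounds("a BUG1 b BUG2 c", reviewers, stub_fix)
    print(final_code)
    print(log.rounds)
```

The extra reviewer from another lab would just be one more entry at the end of the `reviewers` list. Each round sees the code as revised by the previous round, which is one possible reading of "rounds"; running all reviewers on the same snapshot is an equally plausible setup.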

1 comments

Have you found integrating outputs from different frontier labs consistently improves final results, or is it just kind of voodoo?
It's useful, but it increases review time and the mental energy required. Oftentimes Codex and Opus will find the same issues when given a review task but will disagree on issue severity. Codex might claim that something is a blocker, while Opus will say it's just a medium/low. Or vice versa.
Same with legal questions, tbh. They spot the same issues but completely disagree on which ones matter. Maybe you just need a third model to choose between the outputs lol
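
A possible way to handle the severity disagreement discussed in this thread, sketched in Python. The severity scale, the `reconcile` function, and the sample issue names are all made up for illustration, and keeping the higher of two ratings is just one conservative policy, not something anyone above said they use.

```python
# Hypothetical severity scale, ordered from least to most severe.
SEVERITY_ORDER = ["low", "medium", "blocker"]


def reconcile(a: dict[str, str], b: dict[str, str]) -> dict[str, dict]:
    """Merge two reviewers' {issue: severity} maps.

    Assumed policy: keep the higher severity and flag any disagreement
    so a human (or a third reviewer) can break the tie.
    """
    rank = {name: i for i, name in enumerate(SEVERITY_ORDER)}
    merged = {}
    for issue in sorted(a.keys() | b.keys()):
        ratings = [r[issue] for r in (a, b) if issue in r]
        merged[issue] = {
            "severity": max(ratings, key=rank.__getitem__),
            "disputed": len(set(ratings)) > 1,
            "found_by_both": len(ratings) == 2,
        }
    return merged


if __name__ == "__main__":
    # Made-up findings standing in for two reviewers' outputs.
    reviewer_a = {"race in cache": "blocker", "missing null check": "medium"}
    reviewer_b = {"race in cache": "medium", "missing null check": "medium", "typo": "low"}
    for issue, verdict in reconcile(reviewer_a, reviewer_b).items():
        print(issue, verdict)
```

Only the entries with `disputed` set would need to go to a tie-breaker, which keeps the extra review time limited to the issues the two reviewers actually disagree on.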