HN Mirror



	by enraged_camel 10 days ago
	Spawning separate agents to review the original agent's implementation results in a very noticeable increase in code quality and decrease in bugs. This is why I build two or three rounds of sub-agent review into the planning process: I tell the agent authoring the plan to include those review rounds at the end. If the code is particularly load-bearing, I then ask a fourth agent, usually one from the other frontier lab, to review it as well. All of this burns more tokens, of course, but probably far fewer than coming back to the code later to fix bugs. It is also slower, but it saves time in the long run.


yaodub 10 days ago

Have you found integrating outputs from different frontier labs consistently improves final results, or is it just kind of voodoo?


enraged_camel 10 days ago

It's useful, but it increases review time and the mental energy required. Oftentimes Codex and Opus will find the same issues when given a review task but will disagree on issue severity. Codex might claim that something is a blocker, while Opus will say it's just a medium/low. Or vice versa.
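
One way to cut down on the manual comparison is to merge both reviewers' findings and flag only the issues where the severity ratings are far apart. The following is a rough sketch, not anyone's actual tooling: the severity scale, the `Finding` fields, the `issue_id` key used to match the same issue across reviewers, and the `max_gap` threshold are all assumptions made for illustration, and no model API is called.

```python
from dataclasses import dataclass

# Hypothetical severity scale; the thread only mentions "blocker" and "medium/low".
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "blocker": 3}


@dataclass(frozen=True)
class Finding:
    reviewer: str   # a label such as "codex" or "opus"; no API calls are made
    issue_id: str   # hypothetical key that identifies the same issue across reviewers
    severity: str   # one of the keys in SEVERITY_RANK


def merge_findings(findings, max_gap=1):
    """Group findings by issue and flag issues whose ratings differ by more than max_gap."""
    by_issue = {}
    for f in findings:
        by_issue.setdefault(f.issue_id, []).append(f)

    merged = []
    for issue_id, group in by_issue.items():
        ranks = [SEVERITY_RANK[f.severity] for f in group]
        merged.append({
            "issue_id": issue_id,
            "ratings": {f.reviewer: f.severity for f in group},
            # Use the most severe rating as the conservative default.
            "severity": max(group, key=lambda f: SEVERITY_RANK[f.severity]).severity,
            # Large disagreements are the ones worth a human look.
            "needs_human": max(ranks) - min(ranks) > max_gap,
        })
    # Most severe first; sorted() is stable, so ties keep their original order.
    return sorted(merged, key=lambda m: SEVERITY_RANK[m["severity"]], reverse=True)


# Made-up example data mirroring the blocker-vs-medium disagreement described above.
findings = [
    Finding("codex", "missing-null-check", "blocker"),
    Finding("opus", "missing-null-check", "medium"),
    Finding("codex", "unclear-name", "low"),
    Finding("opus", "unclear-name", "low"),
]
report = merge_findings(findings)
```

Taking the highest rating as the default is a deliberately cautious choice; the `needs_human` flag limits the extra review effort to the issues where the two reviewers genuinely diverge.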


yaodub 9 days ago

Same with legal questions, tbh. They spot the same issues but completely disagree on which ones matter. Maybe you just need a third model to choose between the outputs lol
