by enraged_camel 7 days ago
It's useful, but it increases review time and the mental energy required. Oftentimes Codex and Opus will find the same issues when given a review task but will disagree on their severity. Codex might claim that something is a blocker, while Opus will say it's just a medium or low. Or vice versa.

Same with legal questions, tbh. They spot the same issues but completely disagree on which ones matter. Maybe you just need a third model to choose between the outputs lol