by K0balt
61 days ago
In my experience, Sonnet < Opus by a long shot for code review. Sonnet often flags things as errors that are not, because it fails to grasp the big picture, and it also misses structural issues in code that is perfectly written line by line, where the problem only shows up at the meta scale. I have no reason to believe that the next generation won't offer similar gains in verification, and there is some evidence that the cybersecurity implications are the result of exactly this expansion of ability.

Siccing Sonnet on a codebase or PR without guidance does indeed lead to worse results than using Opus, though.