| HN Mirror



thepasch 62 days ago

It depends on how you review. In an orchestrated per-task review workflow with clearly defined acceptance criteria and implementation requirements, using anything other than Sonnet (handed those criteria and requirements) hasn’t really led to much improvement; it just drives up usage and takes longer. I even tried Haiku, but, yeah, Haiku is just not viable for review, even tightly scoped, lol. Siccing Sonnet on a codebase or PR without guidance does indeed lead to worse results than using Opus, though.


K0balt 61 days ago

That makes sense: if your scope is tight enough, good enough is good enough. I’ve got the expected specifications and code style guides, including some aerospace engineering ones, but in complex systems I still run into hard-to-suss-out corner cases where the code works but the system breaks, usually due to unresolved conflicts in operational requirements.


thepasch 60 days ago

There’s definitely a ceiling for what LLMs are capable of, and I think aerospace engineering might just currently be it, haha.


K0balt 57 days ago

Lol yeah, I don’t think I’m ready to ride in the jet that Claude built lol. I should clarify that I use the code guidelines because they are solid guardrails for making things that perform predictably, not because I’m building MCAS lol. Let’s hope that “vibe aerospace engineering” is a way off for now.
