Hacker News new | ask | show | jobs
by adjfasn47573 104 days ago
> Even if we assume LLMs would consistently generate good enough quality code, code submitted by someone untrusted would still need detailed review for many reasons

Wait but under that assumption - LLMs being good enough - wouldn't the maintainer also be able to leverage LLMs to speed up the review?

Often feels to me like the current stance of arguments is missing something.

4 comments

> Wait but under that assumption - LLMs being good enough - wouldn't the maintainer also be able to leverage LLMs to speed up the review?

This assumes that AI capable of writing passable code is also capable of a passable review. It also assumes that you save any time by trusting that review, if it missed something wrong then it's often actually more effort to go back and fix than it would've been to just read it yourself the first time.

A couple weeks ago someone on my team tried using the experimental "vibe-lint" that someone else had added to our CI system and the results were hilariously bad. It left 10 plausible sounding review comments, but was anywhere from subtly to hilariously wrong about what's going on in 9/10 of them. If a human were leaving comments of that quality consistently they certainly wouldn't receive maintainer privileges here until they improved _significantly_.
This is not even about capabilities but responsibility. In an open source context where the maintainers take no responsibility for the code, it's perhaps easier. In a professional context, ultimately it's the human who is responsible, and the human has to make the call whether they trust the LLM enough.

Imagine someone vibe codes the code for a radiotherapy machine and it fries a patient (humans have made these errors). The developer won't be able to point to OpenAI and blame them for this, the developer is personally responsible for this (well, their employer is most likely). Ergo, in any setting where there is significant monetary or health risk at stake, humans have to review the code at least to show that they've done their due diligence.

I'm sure we are going to have some epic cases around someone messing up this way.

It was maybe not quite clear enough in my comment, but this is more of a hypothetical future scenario - not at all where I assess LLMs are today or will get to in the foreseable future.

So it becomes a bit theoretical, but I guess if we had a future where LLMs could consistently write perfect code, it would not be too far fetched to also think it could perfectly review code, true enough. But either way the maintainer would still spend some time ensuring a contribution aligns with their vision and so forth, and there would still be close to zero incentive to allow outside contributors in that scenario. No matter what, that scenario is a bit of a fairytale at this point.

You can not trust the code or reviews it generates. You still have to review it manually.

I use Claude Code a lot, I generate a ton of changes, and I have to review it all because it makes stupid mistakes. And during reviews it misses stupid things. This review part is now the biggest bottleneck that can't yet be skipped.

An in an open source project many people can generate a lot more code than a few people can review.