Hacker News
by spicyusername 65 days ago
On the one hand, open source projects are going to be overrun with AI code that no one has reviewed.

On the other hand, code produced with AI and reviewed by humans can be perfectly good, maintainable, and indistinguishable from regular old code.

Many processes are no longer sufficient in a world where thousands of lines of working code are easy to conjure out of thin air. Already strained open source review processes are definitely among them.

I get wanting to blanket reject AI generated code, but the reality is that no one's going to be able to tell what's what in many cases. Something like a more thorough review process for onboarding trusted contributors, or some other method of cutting down on the volume of review, is probably going to be needed.

4 comments

> reviewed by humans can be perfectly good, maintainable, and indistinguishable from regular old code

That depends on the 'regular old code', but most of the stuff I have seen doesn't come close to 'maintainable'. The amount of cruft is substantial.

Another good example of "the people writing good code with AI are the people who could have done it regardless."

> On the other hand, code produced with AI and reviewed by humans can be perfectly good, maintainable, and indistinguishable from regular old code.

I have yet to see a single example of this. The way you make AI generated code good and maintainable is by rewriting it yourself.

I know it's unpopular to say (here), but I see it all the time. I sometimes cannot tell what I wrote from what the agent wrote; often the only difference is that I have a physical memory of typing it. (I have also seen a lot of garbage, to be fair.)

There is quite a bit of skill to it, however. You cannot just take an AI from blank to "good code" without doing work. Yes, it takes work and quite a bit of it. By this I mean you have to write a good code style guide and a proper explanation of your architectural style(s), your preferences, your goals, plenty of examples, etc. Proper thought has to be put into this.

If you come across bad code, you need to investigate, not castigate: why did this happen? How can we prevent it in the future? Those sorts of processes need to become second nature. They actually should be already, because it's not that much different from managing a bunch of humans.

Humans come with lots of implicit knowledge, and you also select them to match your company's style when you're hiring them. By the time they sit down at their keyboards, you (and society) have already guided them towards a desirable path. (And even then they often still misfire.)

AI agents operate differently. Their range of expression is completely alien to us. A human cannot be both a von Neumann and a complete moron; LLMs have no problem being both. It takes a good while to get used to that.

A policy like this has two points. One, to give good-faith potential contributors a guideline on what the project expects. Two, to give reviewers a clear policy they can point to when rejecting AI slop PRs, without feeling bad or getting into conflicts about minutiae of the code.

Right, "good faith" is a key idea that is being ignored. If you want to lie to the lead SDL maintainers and claim your code is 100% human-written, you can probably get away with it. But that is unethical and cynical behavior in pursuit of an astonishingly petty goal. And it's correct for SDL to simply ignore the contribution because it came from a dishonest developer, even if the specific code appears to be very good.

> On the other hand, code produced with AI and reviewed by humans can be perfectly good and indistinguishable from regular old code.

Obligatory xkcd:

https://xkcd.com/810/