by dwringer 944 days ago
I'm not sure... hundreds or thousands of code repositories with subtle issues sounds like... the real world of code repositories. And I'd think through analogy and redundancy of some common algorithms, the LLM trained that way might conceivably be able to FIX many of those errors.

Someone should build a PoC. AI doesn't know anything other than what it's ingested. So for such an attack to be successful you'd need to tilt the statistics towards problematic code. You'd need loads and loads of repositories, but it's definitely doable.
I don't know about that.

There's a famous 2006 Google Research blog post titled "Nearly All Binary Searches (...) are Broken" [1] due to a commonly occurring bug when implementing binary search.
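
For anyone who hasn't read it: the bug that post describes is computing the midpoint as (low + high) / 2 with signed ints, which overflows once low + high exceeds INT_MAX. Here is a minimal sketch of the overflow-safe form in C. The function name find_index and the half-open range are my own choices for illustration, not taken from the post or from glibc.

```c
#include <stddef.h>

/* Hypothetical helper (name is an assumption, not from any library):
 * returns the index of key in the sorted array a[0..n-1], or -1 if absent. */
ptrdiff_t find_index(const int *a, size_t n, int key)
{
    size_t low = 0, high = n;   /* search the half-open range [low, high) */

    while (low < high) {
        /* The classic bug is mid = (low + high) / 2 with int indices:
         * the sum can overflow. This form cannot, since high >= low. */
        size_t mid = low + (high - low) / 2;

        if (a[mid] < key)
            low = mid + 1;      /* key, if present, is to the right of mid */
        else if (a[mid] > key)
            high = mid;         /* key, if present, is to the left of mid */
        else
            return (ptrdiff_t)mid;
    }
    return -1;                  /* not found */
}
```

Using size_t indices with low + (high - low) / 2 keeps every intermediate value within the array bounds, so there is nothing left to overflow.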

glibc still has that bug [2].

I just asked ChatGPT 4 to write an implementation of binary search in C and it wrote a bug-free version on the first try.

I mean, this is not conclusive evidence, but I find it conceivable that an AI which, despite being trained on buggy code, can still incrementally learn what the different coding constructs actually do would be able to write code with fewer bugs than the code it was trained on...

[1] https://blog.research.google/2006/06/extra-extra-read-all-ab...

[2] https://sourceware.org/bugzilla/show_bug.cgi?id=2753

Some classic Ulrich Drepper in there.