HN Mirror

by einpoklum 9 days ago

> Today’s coding benchmarks have established that models can write correct code.

I wouldn't say that.

> But as AI-generated code becomes the dominant path to production

I really hope that's not the case.

zakisaad 9 days ago

How do you define "correct" code?

newsicanuse 9 days ago

Code that gets stuff done instead of beating around the bush and making unexpected errors.

vanuatu 9 days ago

I suspect this is highly dependent on what you're working on.

from my experience if you give the models a way to self-verify correctness they succeed basically 100% of the time

maccard 9 days ago

> from my experience if you give the models a way to self-verify correctness they succeed basically 100% of the time

My experience is that if you can get the model to one-shot the task, you'll do fine, but if it has to iterate, it leaves things worse than before and almost always requires human intervention after burning through an enormous number of tokens.
