HN Mirror

by ttiurani 381 days ago

Solid point.

A bit tangential, but I've recently been baffled as to why this belief is held by so many:

"since LLMs are trained on online data, they are re-trained with their own AI slop, further damaging them. Soon bad code will get even worse, basically."

together with:

"I’m sure it will get better with time, but it’s not there yet."

What improvements do people think will overcome the poisoning of the well? I'm not an expert at all, but to me it feels like we'd need a new breakthrough to get good output from garbage input.

1 comment

JustinCS 381 days ago

Even as AI generates more writing and code, we still have a way of ranking quality: Good writing and successful projects tend to get more popular and prominent. This selection can allow LLMs to continue to improve. They get a huge flow of slop, but they generate based on the patterns correlated with better quality. The model developers can also develop better ways to curate the input data themselves and keep the slop at bay. It's not a guaranteed or trivial mechanism, but I don't think we need a new breakthrough either.

ttiurani 381 days ago

Maybe.

As a counterpoint: isn't popularity of a library more a metric of API convenience than actual code quality?

And isn't popularity of an essay more about how it conforms to existing beliefs than the quality of the thinking?

JustinCS 381 days ago

Those are good points and that's why progress is not guaranteed or trivial, just plausible.
