HN Mirror



	by platinumrad 54 days ago
	People love harping on this one, but model collapse hasn't turned out to be an issue in practice.

7 comments

xienze 54 days ago

“It’s been a whole year or two and nothing bad has happened, checkmate doomers!”

It’s pretty shocking how much web content, and how many forum posts, are either partially or completely LLM-generated these days. I’m pretty sure feeding this stuff back into models is widely understood to not be a good thing.


larodi 50 days ago

What do you imagine distillation being then?


HerbManic 54 days ago

It feels like if it does happen, it will take a lot longer to show up. Also, I doubt they would ship a model that turns out this corrupted stuff.

It won't mean we see the model collapse in public; more likely we'll struggle to get to the next quality increase.


Tanoc 54 days ago

There have been symptoms of it that have shown up, such as the colloquially called "piss filter" and the anime mole nose problem, but so far they've been symptoms rather than a fatal expression of a disease. That they are symptoms, however, shows they can be terminal if exploited properly and profusely. So far we haven't seen anyone capable of the "profusely" part.


larodi 54 days ago

Besides, models get distilled for fun and profit all the time, which on its own does not support the theory of model collapse.


pigeons 54 days ago

It doesn't seem like anything has changed to preclude it as a possible outcome yet.


Aerroon 54 days ago

I don't really understand why model collapse would happen.

I understand that if I have an AI model and feed it its own responses, it will degrade in performance. But that's not what's happening in the wild: there are extra filtering steps in between. Users upvote and downvote posts, people post the "best" AI generated content (that they prefer), the more human sounding AI gets more engagement etc. All of these things filter AI output, so it's not the same thing as:

AI out -> AI in

It is:

AI out -> human filter -> AI in

And at that point the human filter starts acting like a fitness function for a genetic algorithm. Can anyone explain how this still leads to model collapse? Does the signal in the synthetic data just overpower the human filter?
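One way to poke at this question is a toy simulation. The sketch below is only an illustration under strong assumptions, not a model of real training pipelines: the "model" is a one-dimensional Gaussian refit on its own samples each generation, and the "human filter" is a hypothetical rule that keeps only the samples closest to a fixed preference point (0.0, chosen arbitrarily). The function name `simulate` and all of its parameters are made up for this example.

```python
import random
import statistics


def simulate(generations, n_samples, keep_fraction=1.0, seed=0):
    """Toy loop: sample from a fitted Gaussian, optionally filter, refit.

    keep_fraction < 1.0 stands in for the "human filter": only the samples
    closest to a fixed preference point (0.0, an arbitrary assumption)
    survive into the next round of "training data".
    Returns the fitted standard deviation after each generation.
    """
    rng = random.Random(seed)
    mean, std = 0.0, 1.0  # generation 0: stand-in for the original human data
    history = [std]
    for _ in range(generations):
        # "AI out": the current model generates data
        samples = [rng.gauss(mean, std) for _ in range(n_samples)]
        # "human filter": keep the preferred fraction (at least 2 samples)
        if keep_fraction < 1.0:
            keep = max(2, int(n_samples * keep_fraction))
            samples = sorted(samples, key=abs)[:keep]
        # "AI in": the next model is fit only to what survived
        mean = statistics.fmean(samples)
        std = statistics.stdev(samples)
        history.append(std)
    return history


# Compare the spread of the final "model" with and without the filter.
unfiltered = simulate(generations=50, n_samples=20)
filtered = simulate(generations=50, n_samples=20, keep_fraction=0.5)
print("final std, AI out -> AI in:                ", unfiltered[-1])
print("final std, AI out -> human filter -> AI in:", filtered[-1])
```

Changing `n_samples`, `keep_fraction`, or the filter rule itself (the `key=abs` line) shows how sensitive the outcome is to what the filter actually selects for, which is the part the real-world argument hinges on.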


autoexec 54 days ago

> Users upvote and downvote posts, people post the "best" AI generated content (that they prefer), the more human sounding AI gets more engagement etc. All of these things filter AI output

At the same time, though, AI generated content can be produced much, much faster than human generated content, so eventually AI slop drowns out everything else. You only have to check the popular social media platforms to see this in action: AI generated posts are widely promoted and pushed on users, the same way most web searches return results with AI generated pages ranked highly.

Humans can't keep up and companies are actively working to bypass the human filter and intentionally promote AI generated content.


ragall 54 days ago

The past is not a good predictor of future performance.
