Hacker News
by selcuka 2 hours ago
> the world really has moved on to open models

Don't get me wrong: I'm all for open models, but I think it will get more and more difficult to distil-train them without (legitimate) access to frontier models.

3 comments

I’m not sure, because the same thing happened with Facebook advertising restrictions during the 2018 elections, and nowadays there’s a whole black market for fake ad accounts.

If anything I bet these people will just use their knowledge to make even more money reselling tokens.

As if all progress made in open models is because of distilling...

People have no idea, and everybody pretends to be an expert and ignores how good China is at AI research.

Personally, I find it rather humorous that we've moved from the fear that AI-generated output would corrupt training to the idea that it is essential to training. Reality itself has not just a left bias but a bias toward fundamentals. Bootstrap from fundamentals without introducing arbitrary error and you have the superior system; it just may not be highly compatible with a trash ecosystem.

I mean, I'm not sure that's the correct read on this.

If you want an Opus-class model, it makes sense that you would train on what Opus outputs. But if you want something better than Opus, training on the same data that Opus was trained on with the same architecture will only result in an Opus-class model. Then, if your dataset also contains Opus outputs, many of which are wrong, it makes sense that the model would have reduced performance.

All this to say that I don't think there's such a thing as a "Model Collapse," but there likely is a "Model Stagnation."

A model trained on all the data X was trained on should be improved to the extent that X is already out of date. A model trained on X itself has all the errors of X and all of its own. Society itself seems to show that model collapse is entirely possible today, and it was presumably a problem in the past, given the significance placed on citation and going to original sources, which predates the obsession with credit.