by joshcartme
317 days ago
Maybe I'm totally misreading this, but it seems like the post contradicts itself. At the beginning of the third paragraph:

> Impressively, open source models have been able to quickly catch up to big labs.

And then the beginning of the fourth:

> Open-source has been lagging behind proprietary models for years, but lately this gap has been widening.

Followed by a picture that is more or less inscrutable.
|
The image has been fixed. The point I'm making is that proprietary models are almost always ahead, and this gap is widening. Open-source models that are nearly at the same quality are usually distilled versions of proprietary models, or somehow get training data from them. Sometimes, after massive, expensive training runs, models are open-sourced anyway, and at some point that becomes unsustainable.
The difference between a top model and a model with a similar Elo might seem small, but the value of even a marginal increase in intelligence is extremely high. For example, I only use the best coding model for coding, whatever the cost.
There's also lots of evidence that large labs are only getting started. In the past year, they have secured massive amounts of compute, which is still not utilized well. I expect lots of big training runs in the future, which will further widen the gap between open-source and proprietary models.
The major problem for these companies is that they spend hundreds of millions of dollars training a model, and then someone comes in the next day and distills something almost as good for far less money (though still a VERY large sum of money).
I don't know how this will be resolved long term.