by mmaunder
386 days ago
Models are getting smaller, faster, cheaper to make, reflecting on their own output, adding modes and running in more places. But they’re not getting much smarter because they can only be as smart as us and each other, because that’s where their training comes from. OpenAI is strongest in a world where models cost billions to train. A world filled with cheap open source models is their worst nightmare. This is what’s happening. So they have to pivot into being a product company and away from being a model company.
That doesn't look to be true in general. AlphaGo Zero didn't learn from smarter humans or smarter AIs (at all: it only trained against itself), yet it became better at playing some games than any existing AI or human.
To me it looks like the same thing has happened for LLMs in the one area they are truly good at: natural language processing. Admittedly they only learned to mimic human language by being fed lots of human language, but they look at least as good at parsing and writing as any human now, and much, much faster at it. And admittedly they have plateaued at natural language processing. But that's not because of any inherent limitation in the level of intelligence an AI can achieve. It's because, unlike playing Go, there is a natural limit on how good you can get at mimicking anything, which is "indistinguishable".
The other thing LLMs seem to be good at is lossy compression of all the text they have been trained on. I was floored when I ran a 16GB model locally, and it could tell me things about my childhood town (pop: under 1000, miles away from anywhere). It didn't know a lot, but there isn't a lot out there about it on the internet, and it still astounds me it could compress the major points of everything it read on the internet down to 16GB. The information it regurgitated was sometimes wrong of course, but then you only expect to get an overview of a scene from a highly compressed JPEG. The details will be blurry or downright misleading.
What they are attempting to tack onto that is connecting the facts the LLM knows into a chain of thought. LLMs aren't very good at that, and the improvements over the past few years look to be marginal, yet that is what is being hyped with the current models.
None of that detracts from your main point, which I think boils down to this: the rapid advancements in proprietary models have stalled. Their open source competitors aren't far behind, and if the proprietary models really have stalled, open source will catch up.
But that's only true for the natural language processing side. The sheer compute required to keep a model up to date with the latest information on the internet means the model with the most resources behind it will regurgitate the most accurate information about what's on the internet today. Open source will always lose that race.