Hacker News
by TeMPOraL 734 days ago
Three years is more than twice the time that has passed since GPT-4 was released, and almost twice as long as ChatGPT has existed. At this rate, even if we end up with GPT-4 equivalents runnable on consumer hardware, the top models the big players offer via API will make local LLMs feel useless. For the time being, the incentive to use a service will remain.

It's like a graphic designer having to choose between local MS Paint and Adobe Creative Cloud. Okay, so Llama 3 8B, if it's really as good as you say, graduates to local Paint.NET. Not useless per se, but still not even in the same class.

3 comments

No one knows how it will all shake out. I'm personally skeptical that scaling laws will hold beyond GPT-4-sized models. GPT-4 is likely severely undertrained, given how much data Facebook is using to train their 8B-parameter models. Unless OpenAI has a dramatic new algorithmic discovery or a vast trove of previously unused data, I think GPT-5 and beyond will be modest improvements.

Alternatively, synthetic data might drive the next generation of models, but that's largely untested at this point.

The one thing people overlook is the user data on ChatGPT. That's OpenAI's real moat. That data is "free" RLHF data and, possibly, training data.

I know this isn’t really the point, but Adobe CC hasn’t really improved all that much over Adobe CS, which was purely local and perfectly capable. A better analogy might be found in comparing Encyclopedia Britannica to Wikipedia. The latter is far from perfect, but it is an astounding expansion of accessible human knowledge that represents a full, worldwide paradigm shift in how such information is maintained, distributed, and accessed.

By the same token, those of us who are sufficiently motivated can maintain and utilize a local copy of Wikipedia…frequently for training LLMs at this point, so I guess the snake has come around, and we’ve settled into a full-on ouroboros of digital media hype. ;-)

They're extremely pessimistic: 3 years is 200% of the time it took to match ChatGPT 3.5.

Llama 3 8B is at ChatGPT 3.5's level (released 18 months before Llama 3), and it runs on every new iPhone released since October 2022 (19 months before Llama 3). That includes multimodal variants (built outside Facebook).