by RodgerTheGreat 736 days ago
I truly hope the reckless enthusiasm for LLMs will cool down, but it seems plausible that discretized, compressed versions of today's cutting-edge models will eventually be able to run entirely locally, even on mobile devices; there are no guarantees that they'll get better, but many promising opportunities to get the same unreliable results faster and with less power consumption. Once the models run on-device, there's less of a financial motivation to pull the plug, so we could be stuck with them in one form or another for the long haul.
2 comments

I don't believe this scenario to be very likely because a lot of the 'magic' in current LLMs (emphasis on 'large') is derived from the size of the training datasets and amount of compute they can throw at training and inference.
Llama 3 8B captures that 'magic' fairly well and runs on a modest gaming PC. You can even run it on an iPhone 15 if you're willing to sacrifice floating point precision. Three years from now I fully expect GPT4 quality models running locally on an iPhone.
Three years is more than twice the time since GPT-4 was released to now. Almost twice the time ChatGPT has existed. At this rate, even if we end up with GPT-4 equivalents runnable on consumer hardware, the top models made available by big players via API will make local LLMs feel useless. For the time being, the incentive to use a service will continue.

It's like a graphic designer being limited to choosing between local MS Paint and Adobe Creative Cloud. Okay, so Llama 3 8B, if it's really as good as you say, graduates to local Paint.NET. Not useless per se, but still not even in the same class.

No one knows how it will all shake out. I'm personally skeptical scaling laws will hold beyond GPT4-sized models. GPT4 is likely severely undertrained given how much data Facebook is using to train their 8B parameter models. Unless OpenAI has a dramatic new algorithmic discovery or a vast trove of previously unused data, I think GPT5 and beyond will be modest improvements.

Alternatively synthetic data might drive the next generation of models, but that's largely untested at this point.

The one thing people overlook is the user data on ChatGPT. That's OpenAI's real moat. That data is "free" RLHF data and possibly, training data.
I know this isn’t really the point, but Adobe CC hasn’t really improved all that much from Adobe CS, which was purely local and perfectly capable. A better analogy might be found in comparing Encyclopaedia Britannica to Wikipedia. The latter is far from perfect, but an astounding expansion of accessible human knowledge that represents a full, worldwide paradigm shift in how such information is maintained, distributed, and accessed.

By the same token, those of us who are sufficiently motivated can maintain and utilize a local copy of Wikipedia…frequently for training LLMs at this point, so I guess the snake has come around, and we’ve settled into a full-on ouroboros of digital media hype. ;-)

They're extremely pessimistic; 3 years is 200% of how long it took ChatGPT 3.5.

Llama 3 8B is ChatGPT 3.5 (18 months before Llama 3), running on all new iPhones released since October 2022 (19 months before Llama 3). That includes multimodal variants (built outside Facebook).

Just imagine if you had an accurately curated dataset.
I just want to sit down in front of my TV, put on my Bluetooth headphones and have the headphones and TV connect automatically.

Then, when I’m downstairs in my office and want to listen to music on my iPhone, I want my headphones to connect to my iPhone and not my TV upstairs!

I don’t need Skynet, I just need my devices to be a little less stupid.

I would consider that akin to magic at this point. Let’s start there and work our way up to handing over control of our nuclear arsenal.

The University of Washington is studying an AI application where a pair of headphones will isolate a single voice in a crowd when one simply looks at them. Amazing stuff…until you try it anywhere near your car, and then it starts playing the voice over your car stereo (presumably).