Hacker News
by VMG 10 days ago
Convince me

1. in order to run LLMs, especially the best ones, you need complicated devices which are expensive

2. if you buy one for your personal use, you are probably not going to utilize it all the time and it will be idle a lot

It seems to me that it will always be more economical for the LLM-running devices to sit in a datacenter, where it is easier to make sure they are always utilized
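A rough way to put numbers on this argument is to spread the hardware price over the hours the device is actually busy. The sketch below is illustrative only: the price, lifetime, and utilization figures are made-up assumptions, not measurements, and it ignores power, cooling, networking, and any datacenter margin.

```python
def cost_per_busy_hour(price_usd: float, lifetime_years: float, utilization: float) -> float:
    """Hardware price spread over the hours the device is actually in use."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    total_hours = lifetime_years * 365 * 24
    return price_usd / (total_hours * utilization)


# Hypothetical figures, chosen only to show the shape of the argument.
PRICE_USD = 4000.0
LIFETIME_YEARS = 4.0

home = cost_per_busy_hour(PRICE_USD, LIFETIME_YEARS, utilization=0.05)
datacenter = cost_per_busy_hour(PRICE_USD, LIFETIME_YEARS, utilization=0.80)

print(f"home:       ${home:.2f} per busy hour")
print(f"datacenter: ${datacenter:.2f} per busy hour")
print(f"ratio:      {home / datacenter:.0f}x")
```

With identical hardware, the cost ratio is just the ratio of the two utilization figures, so the conclusion depends entirely on how idle the home device really is.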

10 comments

If a model is substantially better than most humans at most tasks, the human isn't going to be able to perceive the difference between Claude Opus 7.7 and 8.7. Humans at some point aren't going to be able to perceive the difference on benchmarks either, because they are going to get wildly abstract.

AI vendors are really going to struggle to shift tokens far beyond the frontier of human capabilities. It's reasonable (not guaranteed) to assume that, if the trend of frontier models (doubling capabilities on benchmarks every n months) holds, then the same trend will hold for local models, and those local models will meet and exceed the perception frontier. This would mean a human cannot tell the difference between Mistral-Open-2030 and Claude Opus 2030.

That's a bunch of "ifs", but there's nothing exceptional about those "ifs". They're basically the scenario in which the capabilities trend simply continues unchanged between now and ~2030.
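To make the "same trend, later crossing" reasoning concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the unitless capability scores, the 12-month doubling period, and the idea that a single "perception frontier" number exists at all.

```python
import math


def months_to_reach(start: float, target: float, doubling_months: float) -> float:
    """Months until a score that doubles every `doubling_months` reaches `target`."""
    if start <= 0 or target <= 0 or doubling_months <= 0:
        raise ValueError("all arguments must be positive")
    if start >= target:
        return 0.0
    return doubling_months * math.log2(target / start)


# Hypothetical, unitless scores; none of these come from a real benchmark.
PERCEPTION_FRONTIER = 100.0  # level beyond which humans can't tell models apart
FRONTIER_MODEL_NOW = 50.0
LOCAL_MODEL_NOW = 12.5
DOUBLING_MONTHS = 12.0       # the "n" in "doubling every n months"

frontier_eta = months_to_reach(FRONTIER_MODEL_NOW, PERCEPTION_FRONTIER, DOUBLING_MONTHS)
local_eta = months_to_reach(LOCAL_MODEL_NOW, PERCEPTION_FRONTIER, DOUBLING_MONTHS)

print(f"frontier model crosses in {frontier_eta:.0f} months")
print(f"local model crosses in {local_eta:.0f} months")
print(f"local lag: {local_eta - frontier_eta:.0f} months")
```

Under these assumptions the local model never catches the frontier model, but it still crosses the threshold after a fixed delay, which is all the argument needs.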

The trend over the past three decades of personal computing has been for devices to become exponentially more powerful regardless of the actual computing needs of users. The excess computing power has famously been requested by projects such as SETI@Home and Folding@Home, and been exploited by bad actors for crypto mining. The most basic laptop today, used only for web browsing and word processing, would have been a powerful workstation 20 years ago, when the most basic laptop was also used only for web browsing and word processing (and arguably for more things, as it was all mostly local software).

There is no ceiling to the power of consumer hardware. If it's cheap enough, it will be bought.

most crypto mining has moved to specialists, even where there were deliberate attempts to make it ASIC-resistant

SETI@Home is a very niche use case

and web browsing still happens by connecting to data centers and server farms, not by connecting to another laptop

I think you missed the point of my message. Web browsing still happens by connecting to data centres, so why are consumer laptops so much more (unnecessarily) powerful today than they were 20 years ago? All the more so given that, at that time, you were running MS Office locally rather than using Office 365 or Google Docs remotely.

This.

Even two or three years ago people were pointing out "The ChatGPT subscriptions you can buy with $2000 give you much more compute than whatever home setup you come up with" on r/LocalLLM. I did my own elementary school maths and came to the same conclusion.

Yet to this day people still boast about how their beefy M4 Pro/Max machine with 32+GB RAM (which is not at all a "normal person's setup" and costs $2000+) runs LLMs smoothly, and "that's the future".

Someone needs to re-learn basic maths and take a walk around Best Buy to understand what a "consumer laptop" looks like.
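For what it's worth, the elementary school maths looks roughly like this. The $2000 budget is the figure from the comment above; the monthly fee is an assumed price, so check current pricing before relying on it. The sketch only covers how long the budget lasts, not how much compute each option delivers.

```python
def months_of_subscription(budget_usd: float, monthly_fee_usd: float) -> float:
    """How many months of a flat-fee subscription a one-off budget would cover."""
    if monthly_fee_usd <= 0:
        raise ValueError("monthly_fee_usd must be positive")
    return budget_usd / monthly_fee_usd


HARDWARE_BUDGET_USD = 2000.0  # figure quoted in the comment
MONTHLY_FEE_USD = 20.0        # assumed subscription price, not a quoted one

months = months_of_subscription(HARDWARE_BUDGET_USD, MONTHLY_FEE_USD)
print(f"{months:.0f} months, about {months / 12:.1f} years")
```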

If there end up being useful workflows where you keep stuff running in the background or overnight that's one advantage, compared to a data center that might cut off your access during peak hours and so on.

Think of it like having a graphics card at home versus using a cloud gaming stream? Technically, subscribing to GeForce NOW is much cheaper up front than buying a card, but people still buy cards. So will the audience of people running agents at home be as large as PC gaming? I think that's kind of plausible.

> if there end up being useful workflows where you keep stuff running in the background or overnight that's one advantage

That is not how LLMs are typically used, though, in my experience

> Think of it like having a graphics card at home versus using a cloud gaming stream?

Latency seems to be much more important in that use case

> 2. if you buy one for your personal use, you are probably not going to utilize it all the time and it will be idle a lot

I think consumers are primed for that type of behaviour though. I have an iPhone on my desk. It has something like 2-3 TFLOPS of CPU+GPU compute, which is more than double that of the largest supercomputer on earth when Jurassic Park came out, and is probably more computing power than existed on earth when I was born in the 80s.

I use this device for around 1hr per day to write text messages.

It's inevitable. What might be a prosumer device priced at $4000 today will be a regular consumer device in 10 years, and models only get better.

Local models today are fine for a lot of mundane tasks and will continue to be so. The use cases where paying for frontier models is worth it will continue to shrink for folks not doing frontier work.

> models only get better.

Or stall. Progress has been slowing significantly, and gains seem to be tied to huge memory footprints.

Uploading your IP to the biggest IP thieves in human history seems bad idk.

2. Eventually we'll get to where local models that don't have sycophancy and slot-machine mechanics trained into them will perform better.

3. If your device runs on battery, why not use a relatively cheap network call in place of a very power-hungry local inference call?
Privacy and offline use would affect the choice as well. How niche they are, I am not sure.
Just like cloud vs private server. It'll be based on use case.