by dofm 1 hour ago
Back in the earlier days of the internet, when "dedicated servers" were a competitive advantage, hobbyists and small dev shops definitely shared dedicated hardware.

So you could see small LLM co-operatives working out, yeah.

But my thinking is that this four-to-five-year scenario just won't come to fruition, because the need to run these massive models will, more likely than not, be rendered moot by smaller models with better reasoning capacity, and possibly even, on that timescale, by hardware innovations.

One of the biggest problems I have with the whole "we won't be profitable until 2030" model is that 2030 is almost exactly as far in the future as the launch of ChatGPT is in the past. In that time, models far more capable than the first ChatGPT have become freely available to download and run on desktop hardware that existed before it launched, and the entire non-model functionality surrounding the original ChatGPT, plus many more functions, is now little more than a routine weekend coding project.

I don't know why the market would entertain the idea that no upset like that is possible in the same period of time again.

2 comments

The biggest problem is that most AI will be local by 2030. Every future device you buy will have AI compute built in somewhere, just as it has on-device floating point.

On top of this, people are constantly coming up with better ways of running models on less specialized hardware, and "good enough" models now exist for most tasks.

So where does that leave the frontier labs? Drug discovery? Maybe some hard math problems? I mean it's not that big actually...

We're in a brief window where this is profitable, like batch computing was in the 70s. However, once your own device can do it, you're going to start migrating.

> So you could see small LLM co-operatives working out, yeah.

Only on a pay-per-token basis, I think, unless it's a very tight-knit circle of folks. I doubt fixed monthly subscriptions would work in that model, because you'll inevitably get someone pegging the service 24/7 because it's "unlimited" while everyone else suffers.

Well, many of us who shared hardware also ran monitoring to make sure the share was fair; there used to be a whole industry for that sort of quota stuff.

You can presumably hard-limit LLMs the same way: total quotas, burst quotas, etc.
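
A rough sketch of what total-plus-burst quotas could look like for a shared LLM box, assuming a simple token-bucket scheme. The names (`MemberQuota`, `try_spend`) and all the numbers are made up for illustration; this is not any real service's API, and a real co-op would also need persistence and per-period resets.

```python
from dataclasses import dataclass, field


@dataclass
class MemberQuota:
    """Hypothetical per-member limit: a hard total plus a refillable burst bucket."""
    total_remaining: int        # LLM tokens left for the whole billing period
    burst_capacity: int         # most tokens a member can spend in one burst
    refill_per_second: float    # how quickly the burst bucket refills
    burst_available: float = field(init=False)
    last_seen: float = field(init=False, default=0.0)

    def __post_init__(self) -> None:
        # Every member starts with a full burst bucket.
        self.burst_available = float(self.burst_capacity)

    def try_spend(self, tokens: int, now: float) -> bool:
        """Charge `tokens` at time `now` (seconds); return False if over quota."""
        # Refill the burst bucket for the time elapsed, capped at capacity.
        elapsed = max(0.0, now - self.last_seen)
        self.burst_available = min(
            float(self.burst_capacity),
            self.burst_available + elapsed * self.refill_per_second,
        )
        self.last_seen = now

        # Reject if either the period total or the burst bucket is too low.
        if tokens > self.total_remaining or tokens > self.burst_available:
            return False

        self.total_remaining -= tokens
        self.burst_available -= tokens
        return True


if __name__ == "__main__":
    quota = MemberQuota(total_remaining=1000, burst_capacity=300, refill_per_second=10.0)
    print(quota.try_spend(300, now=0.0))   # uses the whole burst bucket
    print(quota.try_spend(1, now=0.0))     # bucket is empty, so this is refused
    print(quota.try_spend(100, now=10.0))  # 10 s later the bucket has refilled by 100
```

The burst bucket stops one member from hogging the hardware in the moment, while the total cap bounds what anyone can use over the period, which is the "pegging the service 24/7" case.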

(Suddenly getting a very fun flashback to the environment in which someone first explained Markov chains to me — MediaMOO. A text-based chat environment with configurable limits on the number of CPU "ticks" you were allowed in order to do things)