Hacker News new | ask | show | jobs
by preommr 5 days ago
> The opposite of that has been happening for 20 years now with cloud compute. It won't happen with AI models either.

AI is different.

Cloud computing genuinely is cheaper on average. It's better than paying for cisco servers, and at scale, it's cheaper than managed platforms (ala Heroku), and it's a coin toss for when you're in the middle ground and constantly approaching the point of rebuilding poor-man versions of existing products but with very very expensive engineering salaries.

In contrast, local models offer dramatic savings, and are magnitude of orders better in certain aspects: like stability - the performance is all over the place with traditional AI companies as they divert compute to their next big thing.

The benefits to maintaining your own infrastructure are pretty moderate to low, with very high risk.

And also, alternate models are pretty easy to use and easy to swap out unlike the vendor lock-in that exists with cloud services.

5 comments

> AI is different.

I agree. The other thing here is that, once you can run LLMs on a single piece of commodity hardware (whether that includes one GPU or several), the difference between cloud vs. on-premise LLMs will largely be about where your hardware is located. There will be very little software configuration involved (just an HTTP endpoint that talks to the GPU). This is decidedly different from cloud products where the moat of hyperscalers is largely in the software and services on top of the hardware, not the hardware itself. (Sure, GPUs will eventually break & need replacement, too, but there's no state to lose, so that's already orders of magnitude easier than replacing hard drives.)

There's also a difference in the cost of downtime. A server hosting your website or SaaS, if it's down for five minutes, costs you a lot of real revenue. So you plan for redundancy, you set up automatic failover so that if one node goes down the next node can handle the load while the first one reboots, and so on. But for the LLM that's just serving your local model? You can tell everyone "Hey, we're taking it down for a 15-minute window, so plan your lunch break while it's down". Unplanned downtime can interrupt what people were doing and cost you productivity and thus money, but it's a lot easier to schedule planned downtime and have people work on non-model-using tasks during those periods: the model is helpful, but not essential.
There's no economic reason why running a model locally should be better than using a cloud hosted version.
“There is no reason anyone would want a computer in their home." - Ken Olson, Founder of Digital Equipment Corporation, in 1977
In hindsight this is getting truer, what with the push of dumb terminal for everyone
Everyone has at least one in their pocket right now though.
Sure there is. Keeping your IP in house.
You pay a 3x markup to rent a server through AWS than managing your own. You pay for convenience. At shall annals that's fine, but for large companies with their own datacenters, you generally do things in house.
> Cloud computing genuinely is cheaper on average.

For some applications, sure. Availability is a large part of what one is paying for with cloud computing, but it's also something that not every business needs.

If you sacrifice availability and have a pure-compute use case (low durability requirements), on-prem can quickly end up cheaper for far better hardware.

AI is different because you can't encrypt it. An running on someone else's hardware is basically just 'trust me bro! I won't read it!'. Of course you can say that about I.e. database too, but at least you can run it on your own dedicated hardware in some datacenter, so it is password protected, you can encrypt it at rest and you will only know the key.

With AI, no, you can't . model needs plain text to be able to work. If somebody will be able to figure out models with asymmetric keys will make a lot of money.