Hacker News
by chiply314 2 hours ago
I think companies will fire 5-10% of their people and convert those salaries into a token budget.

I also believe that before any real companies are running these models locally, they will already have some kind of agentic layer.

With the current pace of frontier model lab progress, I do not see any real company that makes real money running local models.

Running local models is easy for me, but certainly not that easy for a company. Your DC needs to be able to host GPUs and provide the cooling capacity, and you need to have a DC in the first place. Without a DC, you need someone maintaining critical infrastructure, taking care of model evaluation, etc.

For external parties, a new business model might emerge: instead of hiring an external contractor, you hire a token budget and the 'operator of the token budget'.

The current chip fabs are full; developing a high-end, cheap-ish local LLM chip will still take a few years as long as DC GPU demand stays as high as it is.

2 comments

I work with large enterprises that _only_ run critical workloads on locally hosted models. Think banks, insurance, etc--businesses that absolutely cannot leak any data. They also have CC and Codex, but their use is extremely restricted; anything of consequence runs on models running on GPU clusters in their own datacenter.
I work at a large enterprise and they are happy paying Microsoft and AWS for model hosting.

For sure there will be use cases involving very critical data, but in the end the question will still be how big they are in comparison to the rest of the market.

These critical workloads also have the cost issue, right? So will they reduce the workforce to compensate for the budget?

I am calling it now. LLM hosting is the new web hosting. You will have a market of hosting providers offering you access to LLM-compatible hardware (the Hetzners of the LLM world) as well as virtualised LLM access (the Heroku of the LLM world). These will compete along the pricing and ownership axes, while frontier labs will compete mostly on performance, integration and ease of use (think WordPress).

That's the only way I can see frontier labs charging enough to sustain the cash flow they need to operate, as racing to the bottom is not possible for them.

It is interesting to think about whether this is another "Cambrian" era, like the smartphone OS era when you had Symbian, Android, iOS, Windows Mobile and so many others competing.

I work at a very big company and they just pay Azure and AWS to host Claude and co. for them.

So the hyperscalers have probably already won, for now.

At the end of the day, you send a lot of personal data to these endpoints. If you already host everything through Microsoft, LLM hosting there is a no-brainer.