| I think companies will fire 5-10% of their people and convert those salaries into a token budget. I also believe that well before any real company runs these models locally, it will already have some kind of agentic layer in place. Given the pace of the frontier model labs, I do not see any real company that makes real money running local models. Running local models is easy for me, but it is certainly not that easy for a company. You need a DC in the first place, it has to be able to host GPUs, and it needs the cooling capacity. Without your own DC, you still need someone maintaining critical infrastructure, taking care of model evaluation, etc. For external parties, a new business model might emerge: instead of hiring an external contractor, you buy a token budget plus the 'operator of the token budget'. Chip fabs are currently at capacity, so a high-end yet cheapish chip for local LLMs will still take a few years as long as DC GPU demand stays as high as it is. |