Hacker News
by KeplerBoy 105 days ago
Xeons have a much longer shelf life and serve diverse workloads. If you order hardware specifically for LLM inference and some new hardware/model combination then turns out to be much better at it (which it will, because a lot of people are working on that), you might be in trouble.

It's like setting up a warehouse of GPUs to mine bitcoin while others are switching to ASICs.


Training, you mean. Doing inference on last year's chip is probably OK, but training a frontier model on it is going to be a deal breaker.
No, I mean inference. The idea is that inference demand will be massive and that serving it will be a race to the bottom with razor-thin margins.

Training costs can be amortized over the entire lifetime of the model, but if you lose money on inference or can't offer competitive usage limits for subscribers, there's no amortizing that.

No, it's all about having the top model first, and training time is what's crucial. OpenAI has already shown a willingness to bleed money for the sake of its brand, and we can expect that to continue.
OpenAI's economics don't really work unless you happen to be OpenAI.