by mrazomor 687 days ago
What the OP is referring to requires overprovisioning for the high-priority traffic and sine-like utilization (without it, the benefit of the "batch" tier is close to zero: preemption is too frequent for any meaningful work when you are close to the top of the utilization hill).

You get that organically when you are serving lots of users. And there aren't many GPUs etc. used for that. Training LLMs gives you a different utilization pattern, so the "best effort" resources aren't as useful in that setup.
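
A rough sketch of the argument, with made-up numbers. It assumes a fixed capacity provisioned for peak plus some headroom and compares the spare capacity left for a "batch" tier under a sine-like serving load versus a flat, training-style load. The function name `slack_profile` and every figure below are hypothetical, chosen only for illustration.

```python
import math


def slack_profile(capacity, mean_load, amplitude, hours=24):
    """Spare capacity per hour when high-priority load follows a sine-like daily curve."""
    slack = []
    for h in range(hours):
        # High-priority load oscillates around mean_load once per day.
        load = mean_load + amplitude * math.sin(2 * math.pi * h / hours)
        # Whatever is left over is what a best-effort tier could use.
        slack.append(max(0.0, capacity - load))
    return slack


# Hypothetical numbers: capacity = peak load (100) plus 10 units of headroom.
serving = slack_profile(capacity=110.0, mean_load=60.0, amplitude=40.0)
# Training-style load: flat and close to capacity all day.
training = slack_profile(capacity=110.0, mean_load=100.0, amplitude=0.0)

print(f"serving:  min slack {min(serving):.0f}, max slack {max(serving):.0f}")
print(f"training: min slack {min(training):.0f}, max slack {max(training):.0f}")
```

With these numbers the serving curve leaves between 10 and 90 units free over the day, while the flat profile leaves a constant 10, which is roughly the "top of the hill" situation all the time.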