HN Mirror

by simonw 13 hours ago

It doesn't make sense to include the capex cost to train a model in this kind of discussion, because that cost is fixed.

Consider a model that costs $100m to train.

If the vendor then prices it such that each inference token has a margin of 10% over the variable costs to serve (power + server costs), whether or not they cover their costs is based entirely on how many tokens they can sell.

If they sell less than $1bn of tokens, they lose money: the break-even point is 10 × $100m = $1bn.

If they sell $10bn of tokens they make a ton of money.
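
To make the arithmetic concrete, here is a rough sketch of the break-even calculation. The function names are made up for illustration, and the two readings of "10% margin" are an assumption about what the comment means: if 10% of each revenue dollar is margin, break-even is $1bn; if the price is variable cost plus 10%, the margin on revenue is closer to 9.1% and break-even is about $1.1bn.

```python
# Illustrative sketch only: function names and the two "margin" readings
# are assumptions, not figures from any vendor.

def markup_to_margin(markup_on_cost: float) -> float:
    """Convert a markup over variable cost into a margin on revenue."""
    return markup_on_cost / (1 + markup_on_cost)


def break_even_revenue(training_cost: float, margin_on_revenue: float) -> float:
    """Token revenue needed for the margin to repay a fixed training cost."""
    if not 0 < margin_on_revenue <= 1:
        raise ValueError("margin_on_revenue must be in (0, 1]")
    return training_cost / margin_on_revenue


def profit(revenue: float, training_cost: float, margin_on_revenue: float) -> float:
    """Margin earned on token sales minus the fixed training cost."""
    return revenue * margin_on_revenue - training_cost


TRAINING_COST = 100e6  # the $100m example from the comment

# Reading 1: 10% of each revenue dollar is margin.
as_margin = break_even_revenue(TRAINING_COST, 0.10)
# Reading 2: price is variable cost plus 10%.
as_markup = break_even_revenue(TRAINING_COST, markup_to_margin(0.10))

print(f"break-even, 10% margin on revenue: ${as_margin / 1e9:.2f}bn")
print(f"break-even, 10% markup over cost:  ${as_markup / 1e9:.2f}bn")
print(f"profit on $10bn of sales:          ${profit(10e9, TRAINING_COST, 0.10) / 1e9:.2f}bn")
```

Either way the conclusion is the same: the training cost is a fixed hurdle, and sales volume decides whether it gets cleared.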

This also means you can't credibly calculate how much of the fixed training expense is covered by your token spend. Until the model is retired and you can account for how much inference it ran, you don't know what percentage of the training cost each sold token was responsible for.
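
A small sketch of why the per-token share is unknowable in advance: the same $100m training cost spreads very differently depending on lifetime volume. The token volumes below are hypothetical placeholders, not estimates for any real model, and the function name is invented for this example.

```python
# Illustrative sketch: lifetime token volumes are hypothetical.

def training_cost_per_million_tokens(training_cost: float, lifetime_tokens: float) -> float:
    """Fixed training cost attributed to each million tokens sold."""
    if lifetime_tokens <= 0:
        raise ValueError("lifetime_tokens must be positive")
    return training_cost / lifetime_tokens * 1e6


TRAINING_COST = 100e6  # the $100m example from the comment

# The attributed cost is only known once the final lifetime volume is known.
for lifetime_tokens in (1e12, 1e13, 1e14):
    share = training_cost_per_million_tokens(TRAINING_COST, lifetime_tokens)
    print(f"{lifetime_tokens:.0e} tokens sold -> ${share:,.2f} of training cost per 1M tokens")
```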

2 comments

vb-8448 13 hours ago

The cost is fixed if you train a model once every several years; if you have to train 3-4 times per year to stay competitive, training cost becomes a recurring expense.

You also have to include failed training runs and experiments in the math.

There are no official figures, but given how fast new models are rolled out, I wouldn't be surprised if neither Anthropic nor OpenAI manages to cover its full model costs.

frotaur 13 hours ago

I think the capex being fixed assumes you can just stop training the next model. But it's not clear that you can afford to do that and keep selling tokens.

And if capabilities plateau such that training the next one is useless, then the margins will drop fast due to competition.

ACCount37 13 hours ago

The inference-to-training compute ratio for frontier models is estimated to be over 10:1 now.

Driven mostly by just how much inference they sell nowadays - but also by things like base model reuse.
