by simonw
13 hours ago
It doesn't make sense to include the capex of training a model in this kind of discussion, because that cost is fixed. Consider a model that costs $100m to train. If the vendor prices it so that each inference token carries a 10% margin after the variable costs to serve it (power + server costs), whether or not they recoup the training cost depends entirely on how many tokens they sell. The break-even point is 10 x $100m = $1bn: if they sell less than $1bn of tokens, they lose money; if they sell $10bn of tokens, they make a ton of money. This also means you can't credibly calculate how much of the fixed training expense is covered by your token spend: until the model is retired and you can account for how much inference it ran, you don't know what share of the training cost each sold token was responsible for.
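To make the arithmetic concrete, here is a rough sketch in Python. It assumes the 10% margin is measured as a fraction of token revenue, which is the reading that gives the $1bn break-even figure. The function names are made up for illustration, and the only numbers used are the hypothetical ones from the example above.

```python
def break_even_revenue(training_cost, margin):
    """Token revenue needed before the fixed training cost is recovered.

    Assumption: `margin` is the fraction of each revenue dollar left
    after variable serving costs (0.10 for the 10% example).
    """
    return training_cost / margin


def net_profit(token_revenue, training_cost, margin):
    """Lifetime profit after subtracting the fixed training cost."""
    return token_revenue * margin - training_cost


def training_share_per_revenue_dollar(training_cost, lifetime_revenue):
    """Training cost attributable to each dollar of tokens sold.

    Only knowable once lifetime revenue is final, i.e. after the
    model has been retired.
    """
    return training_cost / lifetime_revenue


if __name__ == "__main__":
    training_cost = 100e6  # hypothetical $100m training run
    margin = 0.10          # hypothetical 10% margin on token revenue

    print("break-even revenue:", break_even_revenue(training_cost, margin))
    for revenue in (0.5e9, 1e9, 10e9):
        print(revenue,
              net_profit(revenue, training_cost, margin),
              training_share_per_revenue_dollar(training_cost, revenue))
```

The last function is the point of the final sentence: the per-token share of training cost changes with every extra token sold, so it can't be pinned down while the model is still serving traffic.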
|
You also have to include failed training runs and experiments in the math.
There are no official figures, but given how fast new models are rolled out, I wouldn't be surprised if neither Anthropic nor OAI manages to cover the full cost of its models.