by atq2119
22 days ago
It's been a while since I saw a detailed paper on a high-end training run, but extrapolating from what I remember, those runs are in the tens of trillions of tokens. That already accounts for potentially sampling tokens multiple times during training. It seems like a large number until you realize that OpenAI claims to have almost a billion weekly users, and OpenRouter shows many models at over a trillion tokens per week. So in pure token terms, I'd say it is in fact extremely plausible that inference dominates, at least for the popular models.
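
To make the comparison concrete, here is a rough back-of-the-envelope sketch. The specific figures are assumptions picked for illustration, not measurements: 20 trillion training tokens stands in for "tens of trillions", and 1 trillion tokens per week is taken as a floor for a popular model. The function name `weeks_to_match` is made up for this example.

```python
def weeks_to_match(training_tokens: float, weekly_inference_tokens: float) -> float:
    """Weeks of inference needed to process as many tokens as the training run."""
    if weekly_inference_tokens <= 0:
        raise ValueError("weekly_inference_tokens must be positive")
    return training_tokens / weekly_inference_tokens


# Assumption: one illustrative point inside "tens of trillions of tokens".
TRAINING_TOKENS = 20e12
# Assumption: "over a trillion tokens per week", taken as a lower bound.
WEEKLY_INFERENCE_TOKENS = 1e12

weeks = weeks_to_match(TRAINING_TOKENS, WEEKLY_INFERENCE_TOKENS)
print(f"{weeks:.0f} weeks of inference to match the training token count")
```

Under those assumed numbers, a model serving a trillion tokens per week would pass its training token count in about 20 weeks. This only counts tokens, which is the sense in which the comparison above is made.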