by mNovak
14 days ago
The wild thing to me is that they're serving a $47B run rate worth of requests on maybe 2-3 GW of compute currently [1], of which only a fraction goes to inference rather than R&D and training. Obviously there have been complaints about token limits and such, so they're stretched a bit thin, but nonetheless. Hard to imagine what a world with 100 GW of compute looks like.

[1] https://epochai.substack.com/p/frontier-labs-dont-use-most-a... ^^ This quotes 1.4 GW at the end of 2025. Add 0.3 GW at Colossus 1, and some initial fraction of the 1 GW of Trainium2 from [2].

[2] https://www.anthropic.com/news/anthropic-amazon-compute
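For anyone who wants to poke at the arithmetic, here is a rough back-of-envelope sketch in Python. The $47B, 1.4 GW, 0.3 GW and 1 GW figures are the ones quoted above; the Trainium2 fractions in the loop and the `inference_share` parameter are purely illustrative assumptions, not numbers from either source, and the function names are made up for this sketch.

```python
# Figures quoted in the comment above.
RUN_RATE_USD_B = 47.0   # annualized run rate, in $B
BASE_GW = 1.4           # end-of-2025 figure quoted from [1]
COLOSSUS_GW = 0.3       # Colossus 1
TRAINIUM2_GW = 1.0      # full size of the Trainium2 capacity in [2]


def total_gw(trainium_fraction):
    """Total compute in GW, given the fraction of Trainium2 already online.

    trainium_fraction is an assumption; the comment only says
    "some initial fraction".
    """
    return BASE_GW + COLOSSUS_GW + TRAINIUM2_GW * trainium_fraction


def run_rate_per_gw(gw, inference_share=1.0):
    """Run rate in $B per GW of compute that actually serves requests.

    inference_share is a hypothetical knob: the share of total compute
    used for inference rather than R&D and training.
    """
    return RUN_RATE_USD_B / (gw * inference_share)


if __name__ == "__main__":
    # Illustrative fractions only, chosen to span the "2-3 GW" guess.
    for frac in (0.3, 0.5, 1.0):
        gw = total_gw(frac)
        print(f"Trainium2 fraction {frac:.1f}: {gw:.1f} GW total, "
              f"${run_rate_per_gw(gw):.1f}B per GW")
```

Lowering `inference_share` below 1.0 pushes the per-GW figure up, which is the point of the "only a fraction goes to inference" remark.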
I think token counts and GW are a gross oversimplification here. Not all tokens are the same in the amount of GPU time they consume, the size of the GPUs they require, or the amount of energy they use. There's huge optimization potential here once these companies get serious about consolidating the business they have and executing much more efficiently. Given enough time, they can heavily optimize their operations. For now, their primary concern is short-term growth and not slamming the brakes on it.