by Tuna-Fish
2 hours ago
It won't make sense to run them after two years. The vendors will be limited on datacenter space, power and cooling, and there will be new hardware available that will run the same models at a fraction of the power. A100 -> H100 was >3x tokens per joule, H100 -> B200 >10x. There is still significant low-hanging fruit available in architectural efficiency, and the vendors are chasing it. This is the big risk for AI companies that I feel is not being sufficiently priced in. Almost none of the investments they are making are durable; the depreciation schedules for everything but the real estate should be less than 24 months. Until the hardware is stable enough that you only get double-digit % improvements per generation, it should almost be counted as opex.
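To put rough numbers on that argument, here is a minimal Python sketch. It only uses the lower bounds quoted above (>3x and >10x tokens per joule) and a plain straight-line depreciation schedule; the function names and the 100.0 purchase cost are made up for illustration, and real accounting and real power costs will be messier.

```python
# Lower bounds on tokens-per-joule gains, as quoted in the comment above.
A100_TO_H100 = 3.0
H100_TO_B200 = 10.0


def compounded_gain(*gains: float) -> float:
    """Multiply per-generation efficiency gains into one overall factor."""
    total = 1.0
    for gain in gains:
        total *= gain
    return total


def relative_energy_per_token(gain: float) -> float:
    """Energy per token on the newer part, relative to the older part (1.0 = same)."""
    return 1.0 / gain


def straight_line_book_value(cost: float, months_elapsed: float, schedule_months: float) -> float:
    """Remaining book value under straight-line depreciation, floored at zero."""
    remaining_fraction = max(0.0, 1.0 - months_elapsed / schedule_months)
    return cost * remaining_fraction


if __name__ == "__main__":
    gain = compounded_gain(A100_TO_H100, H100_TO_B200)
    print(f"A100 -> B200 tokens per joule: at least {gain:.0f}x")
    print(f"Energy per token vs. A100: at most {relative_energy_per_token(gain):.3f}")
    # Hypothetical purchase cost of 100.0, compared on a 24-month and a 60-month schedule.
    for schedule in (24, 60):
        value = straight_line_book_value(100.0, 24, schedule)
        print(f"Book value after 24 months on a {schedule}-month schedule: {value:.1f}")
```

On a 24-month schedule the hardware is fully written off by the time the next-but-one generation ships, while a 60-month schedule still carries most of the purchase cost on the books at that point.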
As it stands, there's way more demand than supply. The new GPUs are going to run frontier models while the older ones serve smaller ones.
That said, some of these are running in tents hooked up to mobile turbines. I can see some of those going away, but generally I think you'll see them used until they start to fail in 5-10 years.