Oh, I am not talking about the cards becoming obsolete. That is a concern, but the main issue is that GPUs fail in large numbers after a few years in datacenters.
That is mostly because they are run 24/7 at the peak of their thermal envelopes and eventually components fail.
The replies to his tweet, if true, say that the real lifespan of an AI chip is around 1 to 3 years, since racks don't cool that well. Not sure if these commenters are a reliable source though lol.
https://x.com/xdire_me/status/1987920424978837711
Yeah, but this is partly due to there being a shortage of entry level GPUs for consumers. NVIDIA has literally stopped manufacturing them.
There are massive numbers of data centre GPUs sitting in hyperscaler warehouses waiting to be deployed. They may never be deployed, because there are more GPUs than DC space and you want your most efficient GPUs in the active slots.