Hacker News
by throwawayffffas 16 days ago
Additionally, the internet bubble left us a legacy of installed fiber that remained mostly unused for almost a decade. This time around all the capital-intensive stuff has an expiration date: GPUs have a short training lifespan (4-5 years), and models are outdated the moment their training is complete.
9 comments

4-5 years for a GPU to be outdated is a bit ... outdated. 3090s from 2020 still sell for more than the release price.
Oh I am not talking about the cards becoming obsolete, that is a concern, but the main issue is that GPUs fail in large numbers after a few years in datacenters.

That is mostly because they are run 24/7 at the peak of their thermal envelopes and eventually components fail.

There is also a significant amount of accounting fraud happening right now, according to Michael Burry: https://x.com/michaeljburry/status/1987918650104283372?lang=...

The replies to his tweet, if true, suggest that the real lifespan of an AI chip is around 1 to 3 years, since racks don't cool down that well. Not sure if these commenters are a reliable source though lol. https://x.com/xdire_me/status/1987920424978837711

Yeah, but this is partly due to there being a shortage of entry level GPUs for consumers. NVIDIA has literally stopped manufacturing them.

There are massive numbers of data centre GPUs sitting in hyperscaler warehouses waiting to be deployed in a data centre. They may never be deployed because there’s more GPU than DC space and you want your most efficient GPUs in the active slots.

RTX A6000s and A100s from 2020 also sell for more than the release price.
1. We are talking about datacenter GPUs here, not consumer ones.

2. Datacenters are currently extremely power-limited. Efficiency is king.

Also, thinking about it: fibre was in the ground, so it had minimal storage costs. The same can't really be said about buildings and the hardware in them, which have ongoing costs even when turned off. Storage alone costs money at this scale, and warehouses can be relatively expensive. So there is that aspect too.
Yeah, I think there will be much more waste when the bubble finally pops & it will be harder to recover valuable stuff.

Imagining people buying scrap AI hardware from creditors or bankruptcy auctions & harvesting all the HBM RAM chips and NAND storage chips to sell & throwing away the useless AI optimized compute chips and unusable enterprise interconnects.

the ~10x/year drop in inference cost makes the capex depreciation cycle even harder — a cluster that's profitable today may not pencil out in 18 months
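A rough sketch of that arithmetic, assuming the ~10x/year drop behaves like a smooth exponential decline (my assumption; the function name and the smooth-decline model are illustrative, not from any source):

```python
def relative_price(months: float, annual_drop_factor: float = 10.0) -> float:
    """Price relative to today after `months`.

    Assumes a smooth exponential decline in which the price falls by
    `annual_drop_factor` every 12 months (an illustrative model only).
    """
    return annual_drop_factor ** (-months / 12.0)


# Price a cluster could charge after 18 months, relative to today.
price_after_18_months = relative_price(18)
print(f"Price after 18 months: {price_after_18_months:.1%} of today's")
```

Under that assumption the price per unit of inference after 18 months is roughly 3% of today's, which is why revenue projections based on current prices may not hold over the hardware's life.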
Are the datacenters that are being built not directly analogous? Even if the hardware in them is cooked after 5 years, the buildings, power, cooling, and fiber interconnects are still all valuable.

The models may go out of date but the process and software are continuously improving.

Partially. The GPUs represent about two thirds of the datacenter cost. Hopefully the legacy is going to be a large market of second-hand and refurbished datacenter GPUs that will democratize compute. We are already seeing Nvidia H100s and AMD MI250s hit the secondary market.
My thoughts exactly. At most you get a surplus of cheap third-tier AI, which may or may not be helpful. And/or a bunch of unused, unmaintained, deteriorating data center buildings.
It was only really the US that was left with the legacy of installed fibre.

The 2000 crash left a lot of broken economies worldwide. Many non-US stock markets benefitted from the tech stock feeding frenzy without the investment actually being used to build anything.

If the AI bubble pops, a handful of US megacorps may be left with good models, datacentres and other assets, but the economic shocks will be felt around the world.

4-5 years is not short? Don't companies mostly write off their hardware after 3 years anyway?
It's short compared to the previous bubbles. The capital in the previous bubbles went into things that survived the bubble, networking infrastructure and rail networks.
If you plan to take out a 10-15 year loan to buy those GPUs then it's extremely short. So short the bank won't give you the loan due to lack of collateral.
But this is also an insurance against the threat of an overcapacity-induced bubble: whatever capacity is built, it won't last more than a few years before becoming obsolete anyway. There's no risk that once we've finished building the railroads, or the network links, these will be "more than enough" for at least a decade.
I think the implication is the opposite: the overcapacity in the case of railroads and network links became the substrate that allowed the returns after the bubble, i.e. we are still using a lot of fiber that was laid down in the 2000s and a lot of rail laid down in the early 20th century.

This time around the investments are going to evaporate and we won't get to reap the benefits of very large amounts of compute.

The one inheritance we might get is increased fabrication capacity for state-of-the-art silicon.

From a societal point of view yes, it's certainly better to have already built infrastructure that might be used tomorrow than to burn money in capacity that is obsolete before ever becoming useful. From an investor's point of view though, the existence of available, completely unused capacity is disastrous because it means that prices and investments will remain close to zero until all that capacity is used. For the most obvious example: if you're investing in Nvidia, the scenario where data centers remain full of perfectly viable but completely unused GPUs for a decade is much worse than the scenario in which those GPUs were unused but you still have to build a good amount again within a few years. In the first case Nvidia has absolutely nothing to build and your shares go to zero; in the second case the company takes a hit but they keep selling new products.
I have a question, is the short lifespan of GPUs because they get worn out and are destroyed, or because they get outdated by the ever expanding demands of the AI bubble?

Because if it's the latter, I would assume that growth would not continue at the same rate after the bubble bursts?

It's, from my understanding, a little bit of both. There's a failure rate for GPUs and fans. There are also changes in standards like PCIe and in software stacks.

LLM inference is mainly memory bandwidth constrained so I think it's highly likely that a company will create silicon with just an insane number of memory chips and less compute. These ASICs will probably do the same thing the crypto ASICs did.

If we look back 1 decade, no one uses a GTX 950 for anything.

You'd be surprised, people are somehow buying Tesla P40s and M40s on eBay for almost $300 and $180 respectively (M40 being the same gen as GTX 950). Google Colab still offers T4s and it's taken them years to add modern GPUs. Hope they're powering them with renewables at least.

And people in general are holding on to their old machines for very long periods of time now, especially CPUs. I've had to support first gen Intel i7s at work! That's pre AVX.

Just a note: the P40 came out at $5700 in 2016 dollars. In 2026 dollars that is $8000 (wow!). If you bought 100k of them at that price, then assuming a 1% failure rate per year, your $800M investment can be traded in for about $30M.

I think it is reasonable to assume a similar depreciation in GPUs.

Meaning you'd need to have made more than (800M - 30M) * (1 + income tax rate) + (power + maintenance).
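For anyone who wants to plug in their own numbers, here is a minimal sketch of that back-of-the-envelope calculation. The unit count, purchase price, and 1% failure rate come from the comment above; the $300 resale price is the eBay figure quoted upthread; the 10-year holding period, the tax rate, and the operating cost are placeholder assumptions of mine.

```python
def surviving_units(units: int, annual_failure_rate: float, years: int) -> float:
    """Expected units still working, assuming a constant yearly failure rate."""
    return units * (1.0 - annual_failure_rate) ** years


def break_even_revenue(purchase_cost: float, resale_value: float,
                       income_tax_rate: float, operating_cost: float) -> float:
    """(purchase - resale) * (1 + tax rate) + (power + maintenance), as in the comment."""
    return (purchase_cost - resale_value) * (1.0 + income_tax_rate) + operating_cost


UNITS = 100_000
PRICE_PER_UNIT = 8_000   # 2026 dollars, from the comment
RESALE_PER_UNIT = 300    # used P40 price quoted upthread
YEARS = 10               # assumed holding period (2016 to 2026)

purchase_cost = UNITS * PRICE_PER_UNIT
resale_value = surviving_units(UNITS, 0.01, YEARS) * RESALE_PER_UNIT

# Tax rate and operating cost below are hypothetical placeholders.
required = break_even_revenue(purchase_cost, resale_value,
                              income_tax_rate=0.21, operating_cost=0.0)

print(f"Purchase cost:       ${purchase_cost / 1e6:,.0f}M")
print(f"Resale value:        ${resale_value / 1e6:,.1f}M")
print(f"Break-even revenue:  ${required / 1e6:,.0f}M (before power and maintenance)")
```

With failures included the resale value comes out at roughly $27M, in line with the "about $30M" estimate.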

Some say the margins on inference are already there for new GPUs, but they are tight margins.

Outside of training the biggest LLMs at big labs, GPU lifespan isn't as short as the OP made it out to sound. A100s are 6 years old and still a reliable work-horse, and the 80GB version hasn't depreciated that much on the used market. On the consumer side, 3090s are actually still selling for very close to 2020 MSRP.

Even the ancient V100 (soon to be 10 years old!) had something of a resurgence on the second-hand market, with a healthy market for interconnects in China.

If I had a datacenter and power consumption was not a concern, I'd be holding on to my A100s for years at least for inference.

Oh yeah, not meant to be all doom and gloom. Lighter workloads greatly increase hardware lifespan. And the GPUs are at most about 50% of the data-center cost, I think. You get to keep the building, the cooling, the power interconnects, the networking and everything else.

Additionally the demand drives new power infrastructure, and new fabs that will definitely outlive the bubble.

As with compute hardware, someone will have a chart keeping track of "additional electricity cost per unit of compute versus state-of-the-art hardware", to determine when it's cheaper to just turn it off and replace with newer hardware.
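A minimal sketch of what that comparison could look like. Every number and name here is hypothetical and only there to show the shape of the decision: already-owned hardware has no remaining capital cost, while the replacement carries an amortized purchase cost per hour.

```python
from dataclasses import dataclass


@dataclass
class Accelerator:
    name: str
    power_kw: float      # average draw under load, in kW
    throughput: float    # compute units per hour (arbitrary unit)
    hourly_capex: float  # amortized purchase cost per hour; 0 if already paid off


def cost_per_unit(acc: Accelerator, electricity_price_per_kwh: float) -> float:
    """Total hourly cost (electricity + amortized capex) per unit of compute."""
    hourly_cost = acc.power_kw * electricity_price_per_kwh + acc.hourly_capex
    return hourly_cost / acc.throughput


def should_replace(old: Accelerator, new: Accelerator,
                   electricity_price_per_kwh: float) -> bool:
    """True when the new hardware is cheaper per unit of compute."""
    return (cost_per_unit(new, electricity_price_per_kwh)
            < cost_per_unit(old, electricity_price_per_kwh))


# Hypothetical figures, chosen only for illustration.
old_gpu = Accelerator("old", power_kw=0.4, throughput=1.0, hourly_capex=0.0)
new_gpu = Accelerator("new", power_kw=0.7, throughput=4.0, hourly_capex=0.5)

for price in (0.10, 1.00):
    print(f"${price:.2f}/kWh -> replace: {should_replace(old_gpu, new_gpu, price)}")
```

With these made-up numbers the old card stays in service when electricity is cheap and gets replaced when it is expensive, which matches the point about power-limited datacenters: the more constrained the power, the sooner older hardware gets turned off.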
They get worn out. Training workloads have high utilization and high thermals, and eventually things degrade and break.
Are there estimates of their failure rate?
From Tom's Hardware, the figures look like 27% fail after 3 years.

https://www.tomshardware.com/pc-components/gpus/datacenter-g...
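To put the 27% figure on a per-year basis, here is a small sketch that converts a cumulative failure fraction into an equivalent annual rate. It assumes failures happen at a constant rate each year, which is my simplification and not something the article states.

```python
def annualized_failure_rate(cumulative_fraction: float, years: float) -> float:
    """Constant yearly failure rate that compounds to `cumulative_fraction` over `years`."""
    survival = 1.0 - cumulative_fraction
    return 1.0 - survival ** (1.0 / years)


annual_rate = annualized_failure_rate(0.27, 3)
print(f"Equivalent annual failure rate: {annual_rate:.1%}")
```

Under that assumption, 27% over 3 years works out to roughly 10% per year.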