by frognumber 361 days ago
> Another possibility that has long been on my personal list of “future articles to write” is that the future of computing may look more like used cars. If there is little meaningful difference between a chip manufactured in 2035 and a chip from 2065, then buying a still-functional 30-year-old computer may be a much better deal than it is today. If there is less of a need to buy a new computer every few years, then investing a larger amount upfront may make sense – buying a $10,000 computer rather than a $1,000 computer, and just keeping it for much longer or reselling it later for an upgraded model.

This seems improbable.

50-year-old technology works because 50 years ago, transistors were micron-scale.

Nanometer-scale nodes wear out much more quickly. Modern GPUs have a rated lifespan in the 3-7 year range, depending on usage.

One of my concerns is we're reaching a point where the loss of a fab due to a crisis -- war, natural disaster, etc. -- may cause systemic collapse. You can plot lifespan of chips versus time to bring a new fab online. Those lines are just around the crossing point; modern electronics would start to fail before we could produce more.


> Modern GPUs have a rated lifespan in the 3-7 year range, depending on usage.

That statement absolutely needs a source. Is "usage" 100% load 24/7? What is the failure rate after 7 years? Are the failures unrepairable, i.e. not just a broken fan?

I’ve never heard of this and I was an Ethereum miner. We pushed the cards as hard as they would go and they seemed fine after. As long as the fan was still going they were good.
So Intel used to claim a 100,000+ hour lifetime on their chips. They didn't actually test them to this because that is 11.4 years. But it was basically saying, these things will last at full speed way beyond any reasonable lifetime. Many chips could probably go way beyond that.

I think it was about 15 years back they stopped saying that. Once we passed the 28nm mark it started to become apparent that they couldn't really state that.

It makes sense, as parts get smaller they will get more fragile from general usage.

With your GPUs, yeah, they are probably still fine, but they could already be halfway through their lifetime and you wouldn't know it until the failure point. Add in the silicon lotto and it gets more complicated.
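
As a quick check of the 100,000-hour figure above, here is a minimal sketch of the conversion to years. It assumes continuous 24/7 operation and an average year of 365.25 days; the function name is just for illustration.

```python
HOURS_PER_YEAR = 24 * 365.25  # average year length, including leap days


def hours_to_years(hours: float) -> float:
    """Convert continuous operating hours to calendar years."""
    return hours / HOURS_PER_YEAR


rated_years = hours_to_years(100_000)
print(f"{rated_years:.1f} years")  # 11.4 years
```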

One thing to realize is the lifetime is a statistical thing.

I design chips in modern tech nodes (currently using 2nm). What we get from the fab is a statistical model of device failure modes. Aging is one of them. When transistors gradually age they get slower due to increased threshold voltage. This eventually causes failure at a point where timing is tight. When it will happen varies greatly due to initial conditions and the exact conditions the chip was in (temp, vdd, number of on-off cycles, even the workload). After an aging failure the chip will still work if the clock freq is reduced. There are sometimes aging monitors on-chip which try to catch it early and scale down the clock.

There are catastrophic failures too, like gate insulator breakdown, electromigration or mechanical failures of IO interconnect. The last one is orders of magnitude more likely than anything else these days.
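
The aging point above (a slower critical path fails timing at the original clock but still works at a lower one) can be sketched roughly like this. All numbers and names here are hypothetical, picked only to illustrate the idea; a real aging monitor is on-chip hardware, not Python.

```python
def meets_timing(path_delay_ns: float, clock_mhz: float) -> bool:
    """True if the critical path fits inside one clock period."""
    period_ns = 1000.0 / clock_mhz
    return path_delay_ns <= period_ns


def highest_safe_clock(path_delay_ns, clock_steps_mhz):
    """Pick the fastest available clock step that still meets timing, or None."""
    safe = [f for f in clock_steps_mhz if meets_timing(path_delay_ns, f)]
    return max(safe) if safe else None


# Hypothetical numbers, for illustration only.
CLOCK_STEPS_MHZ = [1600, 1800, 2000]
fresh_delay_ns = 0.48  # critical path when the chip is new
aged_delay_ns = 0.52   # same path after threshold-voltage drift slows it

print(highest_safe_clock(fresh_delay_ns, CLOCK_STEPS_MHZ))  # 2000
print(highest_safe_clock(aged_delay_ns, CLOCK_STEPS_MHZ))   # 1800
```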

For mining, if a GPU was failing in such a way that it was giving completely wrong output for functions during mining, that'd only be visible as a lower successful hash rate, which you might not even notice unless you did periodic testing of known-target hashes.

For graphics, the same defect could be severe enough to completely render the GPU useless.
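
A minimal sketch of what "periodic testing of known-target hashes" could look like. This is an assumption-heavy illustration: SHA-256 on the CPU stands in for whatever the miner actually runs on the GPU, and `faulty_hash` is a made-up simulation of a defective unit. Only the two test vectors are standard, published SHA-256 digests.

```python
import hashlib

# Known-answer vectors: standard SHA-256 digests for fixed inputs.
KNOWN_VECTORS = {
    b"": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    b"abc": "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad",
}


def reference_hash(data: bytes) -> str:
    """Healthy unit: plain SHA-256."""
    return hashlib.sha256(data).hexdigest()


def faulty_hash(data: bytes) -> str:
    """Simulated defective unit: flips the lowest bit of the last digest byte."""
    digest = bytearray(hashlib.sha256(data).digest())
    digest[-1] ^= 0x01
    return digest.hex()


def self_test(hash_fn) -> list:
    """Return the inputs whose digest does not match the known answer."""
    return [msg for msg, expected in KNOWN_VECTORS.items() if hash_fn(msg) != expected]


print(self_test(reference_hash))  # []
print(self_test(faulty_hash))     # [b'', b'abc']
```

Run on a schedule, a non-empty result would flag silent corruption that otherwise only shows up as a slightly lower accepted-share rate.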

Yeah, chip aging is at max temperature, max current, and worst process corner. And it's nonlinear so running at <10% duty cycle could reduce aging to almost nothing.
Has chip aging finally surpassed thermal cycling as the primary cause of component failure in datacenters?
I don't know but I would guess not. Solder is really weak compared to silicon.
Every now and then, I get a heartfelt chuckle from HN.

By 'Modern' they must mean latest generation, so we'll have to wait and see. I was imagining not using an RTX 5090 for 7 years and finding it doesn't work, or one used 24x7 for 3 years then failing.

Electromigration and device aging are huge issues. I can't imagine a modern GPU having a lifetime longer than 3 years at 100C temperature.

Though, it can be solved with redundancy at the cost of performance.

Just look at warranties, gotta go to Quadro series for industrial warranty lengths.
> Nanometer-scale nodes wear out much more quickly. Modern GPUs have a rated lifespan in the 3-7 year range, depending on usage.

I recently bought a new MacBook, my previous one having lasted me for over 10 years. The big thing that pushed me to finally upgrade wasn’t hardware (which as far as I could tell had no major issues), it was the fact that it couldn’t run latest macOS, and software support for the old version it could run was increasingly going away.

The battery and keyboard had been replaced, but (AFAIK) the logic board was still the original

> it couldn’t run latest macOS, and software support for the old version it could run was increasingly going away.

which is very annoying, as none of the newer OS versions has anything that warrants dumping hardware to buy brand new to run them with! With the exception of security upgrades, which I find dubious for a company to stop creating (as they would need to do so for their newer OS versions just as well, so the cost of maintaining security patches ought to not be much, if at all), it is definitely more likely to be a dark pattern to force hardware upgrades.

That's not just a dark pattern, it's the logical conclusion to Apple's entire business model. It's what you get for relying on the proprietary OS supplied by a hardware manufacturer. It's why Asahi Linux is so important.
I'm not sure I agree. Open source software also regularly drops support for old hardware and OSes.
"regularly" is doing a lot of work here. When Linux drops hardware support, we are talking about ancient hardware. An example of a regular drop: Linux 6.15 just a month ago dropped support for 486 (from 1989)!
That's surprising. What is the 486 missing that Linux needs? Or is it that there are no volunteers to test and maintain Linux on a 486 (as often happens with older architectures)?
Pretty much, you can still get modern distros that support 32bit PowerPC.
Open source software drops hardware support only when there is nobody left who volunteers to support that hardware. When does this happen? It happens when there are not enough users left of that hardware.

As long as there are enough users of some hardware, free software will support it, because the users of that hardware want it to.

Is "regularly" every 2-4 years, or longer? What are your options? With Apple you have none. It's really not a comparable situation.
And then he still couldn’t use the third party software he says he depends on…
Depending on how much has changed in the interval, backporting security fixes can be completely trivial, very difficult, or anywhere in between. There may not even be a fix to backport, as not all vulnerabilities are still present in the latest release.
You mean besides the fact that they completely transitioned to new processors and some of the new features use hardware that is only available on their ARM chips?

Also he said that software from third parties also doesn't support the older OS, so even if Apple did provide security updates, he would still be in the same place.

I've got 3 MacBooks from 2008, 2012, and 2013. Apple dropped macOS support years ago. They all run the latest Fedora Linux version with no problems.

The screen on the MacBookPro10,2 is 2560x1600 which is still higher than a lot of brand new laptops. The latest version it will run is 10.15 from 2019. I know Apple switched to ARM but most people don't need a new faster computer. I stopped buying Apple computers because I want my computer supported more than 6 years.

I do have 3 newer computers but these old MacBooks are kept at various relatives' houses for when I visit and want my own Linux machine. They have no problems running a web browser and watching videos so why replace them?

> Modern GPUs have a rated lifespan in the 3-7 year range, depending on usage.

I seriously doubt this is true. The venerable GTX 1060 came out 9 years ago, and still sees fairly widespread use based on the Steam hardware survey. According to you, many (most?) of those cards should have given out years ago.

This is just untrue, and you’ve provided no citation, either.

The silicon gates in GPUs just don’t wear out like that, not at that timescale. The only thing that sort of does is SSDs (and that’s a write limit, which has existed for decades, not a new thing).

3 seconds with a web search would bring up citations:

https://duckduckgo.com/?t=ffab&q=gpu+lifespan&ia=web

You'll need to scroll past the ones talking about obsolescence versus failure. Toss in 'data center' if you like.

You'll see a range of numbers -- including some lower than I cited -- but it's all in that ballpark.

Electromigration tends to get worse with small sizes but also higher voltage and temperatures. I could see a GPU wearing out that quickly if it were overclocked enough, but stock consumer GPUs will last much longer than that.
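
To put rough numbers on the temperature and current dependence, here is a sketch using Black's equation for electromigration, MTTF = A * J^(-n) * exp(Ea / (k*T)). Note it is written in terms of current density rather than voltage. The exponent `n` and activation energy `ea_ev` below are assumed, illustrative values and not figures for any real GPU; the prefactor A cancels when comparing two operating points.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K


def mttf_ratio(j_ratio, temp_a_c, temp_b_c, n=2.0, ea_ev=0.8):
    """MTTF at condition B divided by MTTF at condition A (Black's equation).

    j_ratio is J_B / J_A (relative current density). n and ea_ev are
    assumed illustrative parameters, not measured values.
    """
    t_a = temp_a_c + 273.15  # Celsius to kelvin
    t_b = temp_b_c + 273.15
    current_term = j_ratio ** (-n)
    thermal_term = math.exp(ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_b - 1.0 / t_a))
    return current_term * thermal_term


# Hypothetical comparison: stock at 70 C vs. 20% more current density at 90 C.
print(mttf_ratio(j_ratio=1.2, temp_a_c=70, temp_b_c=90))
```

With these assumed parameters the hotter, higher-current case comes out at a small fraction of the stock lifetime, which is the direction the comment describes. The absolute numbers depend entirely on the parameters chosen.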
electromigration is real, but is it relevant?

since electromigration is basically a matter of long, high-current interconnect, I guess I have been assuming it's merely designed around. By, for instance, having hundreds of power and ground pins, implying quite a robust on-chip distribution mesh, rather than a few high-current runs.

Wouldn't it depend on workloads? My GPU that kicks into high gear for maybe 2-3 hours a week will probably do decades of use before chip degradation kicks in. The power capacitors will give out long before the silicon does.

But if someone is running an LLM 24 hours a day, it might not go for as long.

We are flying blind, both those claiming a short lifespan and those who are not.

you're thinking of what, electromigration?

what is the age-related failure mode you're referring to?

or are you merely referring to warranty period? (which has more to do with support costs, like firmware - not expected failures.)

> One of my concerns is we're reaching a point where the loss of a fab due to a crisis -- war, natural disaster, etc. -- may cause systemic collapse.

This is absolutely ridiculous. Even if Taiwan sank today we really don't need those fabs for anything critical. I strongly suspect we could operate the entire supply chain actually necessary for human life with just z80s or some equivalent.