by w1nk 1497 days ago
Actually, there's reasonable evidence to believe this isn't true. This paper from Google, https://sigops.org/s/conferences/hotos/2021/papers/hotos21-s..., outlines specific scenarios where CPUs fail over time. Given the evidence that these are silicon defects that actually worsen over time, there's no reason to imagine these failures don't extend to GPUs as well.

The difference in the data here is obviously scale: Google has -way- more CPUs than GPUs, so the absolute counts of failures will be different.


I'll concede that silicon can wear out over time; it can't be immune to that, and I wasn't speaking in absolutes. But as you mention, it's a question of scale. I'm curious how likely it is that a second-hand GPU from a cryptominer is affected.

So that's actually one of the awful implications of this paper. It's probably happening at a higher rate than humans would notice.

If a given piece of silicon is hosing up a GEMM (general matrix multiply), in graphics scenarios the error may be invisible to the human eye: it could just introduce artifacts in a rendered scene that last for a single frame and then disappear.
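
To make the "silently wrong GEMM" case concrete, here's a rough sketch of one way to catch it without redoing the whole multiply. It uses Freivalds' algorithm, which is my choice for illustration and not something the paper prescribes. The function names are made up, and plain Python lists with integer entries stand in for whatever the GPU would actually return.

```python
import random

def matmul(a, b):
    """Reference triple-loop matrix multiply on lists of lists."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]

def freivalds_ok(a, b, c, rounds=64, rng=None):
    """Probabilistically check that a @ b == c (integer matrices).

    Each round multiplies by a random 0/1 vector, which costs O(n^2)
    instead of the O(n^3) of a full recomputation. A wrong c passes one
    round with probability at most 1/2, so it survives all rounds with
    probability at most 2**-rounds.
    """
    rng = rng or random.Random()
    cols = len(b[0])
    for _ in range(rounds):
        r = [rng.randint(0, 1) for _ in range(cols)]
        b_r = [sum(x * y for x, y in zip(row, r)) for row in b]    # b @ r
        ab_r = [sum(x * y for x, y in zip(row, b_r)) for row in a]  # a @ (b @ r)
        c_r = [sum(x * y for x, y in zip(row, r)) for row in c]    # c @ r
        if ab_r != c_r:
            return False  # c is definitely not a @ b
    return True  # c is very likely a @ b

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
good = matmul(a, b)
bad = [row[:] for row in good]
bad[0][1] += 1  # simulate a single corrupted output element
print(freivalds_ok(a, b, good), freivalds_ok(a, b, bad))
```

With floating-point results you'd need a tolerance instead of exact equality, and nothing here says a renderer could afford even this check per frame. It only shows that a corrupted product is detectable if someone bothers to look.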

In the case of crypto mining, though, it's completely possible (probable?) that there are GPUs that can never calculate a proper SHA3 hash (see the paper on AES instructions that fail in symmetric ways).
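
A hash unit that is deterministically wrong is at least easy to spot with a known-answer test. Below is a minimal sketch, assuming you can wrap the device under test in a function that takes bytes and returns a hex digest. The `hash_fn` hook and the function names are hypothetical, and `hashlib` on the CPU stands in for the GPU here. The two digests are the standard published SHA3-256 values for the empty message and for "abc".

```python
import hashlib

# Published SHA3-256 test vectors (message -> expected hex digest).
KNOWN_ANSWERS = {
    b"": "a7ffc6f8bf1ed76651c14756a061d662f580ff4de43b49fa82d80a4b80f8434a",
    b"abc": "3a985da74fe225b2045c172d6bd390bd855f086e3e9d525b46bfe24511431532",
}

def cpu_sha3_256(msg):
    """Reference implementation; a stand-in for the device under test."""
    return hashlib.sha3_256(msg).hexdigest()

def failing_vectors(hash_fn=cpu_sha3_256):
    """Return the messages whose digest from hash_fn does not match."""
    return [msg for msg, want in KNOWN_ANSWERS.items() if hash_fn(msg) != want]

def stuck_unit(msg):
    """Simulated defect: a unit that always returns the same wrong digest."""
    return "00" * 32

print(failing_vectors())            # healthy reference path
print(failing_vectors(stuck_unit))  # every vector is flagged
```

This only catches faults that show up on the chosen inputs. A defect that corrupts results for rare data patterns, or only at certain temperatures or clock speeds, could pass a handful of vectors and still be broken.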