Hacker News
by artaak 2305 days ago
Could you cite an example where the ECC requirement on a GPU was real and demonstrated to be needed? In practice, I don't know anyone who'd willfully take 10-15% perf hit on GPUs, because of a cosmic ray.

The thermal design for a "datacenter" card can be better for sure, and so can the on-board memory size and design. That's about it. And that's for how many times the GeForce price tag?

1 comment

"In practice, I don't know anyone who'd willfully take 10-15% perf hit on GPUs, because of a cosmic ray."

Virtually every server in data centers runs on ECC memory: the notion of not using it is simply absurd. And given that the Tesla V100 gets 900 GB/s of memory bandwidth with ECC, versus 616 GB/s on the 2080 Ti without ECC, the supposed performance hit is a strawman to begin with.

NVIDIA further states that there is zero performance penalty for ECC.

As to whether the requirement is "real": Google did an analysis in which they found that their ECC memory corrected one bit error every 14 to 40 hours per gigabit.
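
To put that rate in GPU terms, here is a back-of-the-envelope sketch. It assumes the per-gigabit rate carries over unchanged to GPU memory and scales linearly with capacity. Both are my assumptions, not something the Google analysis claims, and the capacities are just illustrative card sizes.

```python
# Range quoted above: one corrected bit error every 14 to 40 hours per gigabit.
HOURS_PER_ERROR_PER_GBIT = (14.0, 40.0)


def hours_between_errors(capacity_gbytes):
    """Return (shortest, longest) expected hours between corrected errors.

    Assumption: the error rate scales linearly with memory capacity.
    """
    gigabits = capacity_gbytes * 8  # 8 gigabits per gigabyte
    shortest, longest = HOURS_PER_ERROR_PER_GBIT
    return shortest / gigabits, longest / gigabits


if __name__ == "__main__":
    # Illustrative capacities in GB (assumed, not taken from the analysis).
    for gb in (11, 16, 32):
        shortest, longest = hours_between_errors(gb)
        print(f"{gb} GB: one corrected error every "
              f"{shortest * 60:.1f} to {longest * 60:.1f} minutes")
```

Under those assumptions a 16 GB card would see a corrected error roughly every 7 to 19 minutes, which is the scale at which "a cosmic ray" stops being a rare event.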

"That's about it."

Also ECC memory. Also dramatically higher double-precision performance. Also dramatically higher tensor performance. Aside from all of that... that's it.