| HN Mirror



	by lostmsu 1357 days ago
	2 TFlops or 5 TFlops does not matter much. The 3090Ti does 160 TFlops, i.e. at least 30x (!) faster.


bluescarni 1357 days ago

Those are Tensor flops, the numbers for the Zen CPU are "general-purpose" flops (sometimes called "vector flops" in marketing material).

The vector flops for the 3090Ti are 33 TFlops for single precision, 0.5 TFlops for double precision. So, 16x faster than the 5950x in single precision, 2x slower for double precision. At almost 3x the price and >4x the power consumption.

Of course, if all you care about is AI, then there's no argument - but then we are not really talking about a general-purpose device any more.

The narrative of GPUs being "hundreds of times" faster than CPUs is vastly blown out of proportion for general-purpose computing.
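As a rough check of the arithmetic in this comment, the sketch below backs out the 5950x throughput implied by the quoted 3090Ti figures and ratios. It uses only numbers stated above; the constant and function names are made up for illustration, and the results are implied values, not measured ones.

```python
# Figures quoted in the comment (TFlops, vector / "general-purpose" units).
GPU_FP32_TFLOPS = 33.0   # 3090Ti, single precision
GPU_FP64_TFLOPS = 0.5    # 3090Ti, double precision

# Ratios stated in the comment.
FP32_GPU_SPEEDUP = 16.0  # GPU is 16x faster than the 5950x in single precision
FP64_CPU_SPEEDUP = 2.0   # GPU is 2x slower than the 5950x in double precision


def implied_cpu_tflops():
    """Back out the 5950x throughput implied by the quoted ratios."""
    fp32 = GPU_FP32_TFLOPS / FP32_GPU_SPEEDUP
    fp64 = GPU_FP64_TFLOPS * FP64_CPU_SPEEDUP
    return fp32, fp64


cpu_fp32, cpu_fp64 = implied_cpu_tflops()
print(f"implied 5950x FP32: {cpu_fp32:.2f} TFlops")
print(f"implied 5950x FP64: {cpu_fp64:.2f} TFlops")
```

The implied single-precision figure comes out close to the "2 TFlops" mentioned at the top of the thread.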


lostmsu 1357 days ago

I think you missed that this whole discussion is in the context of deep learning, therefore your comment does not apply. It is 30x slower than the 3090Ti for that purpose.


bluescarni 1356 days ago

My initial comment was correcting a factually inaccurate statement regarding CPU performance.

It is you who barged into the thread with unrelated GPU performance numbers, but whatever :)


lostmsu 1356 days ago

You are missing the forest for the trees.

Here's the comment I assume you are trying to "correct":

> with full training you are out of luck with CPUs, the gap is much bigger. 64c TR could only get to roughly 1TFlops

1TFlops is not the main part of that statement, and it is qualified with "roughly", which I suppose is not too far from the truth in this context. And the context is "training ... the gap is much bigger", and in this case "much" is at least 30x even with the updated number.
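For what it's worth, the "at least 30x" claim can be checked against the CPU figures mentioned in this thread. This is a minimal sketch using only the quoted numbers (160 TFlops for the 3090Ti, and 1, 2 or 5 TFlops for the CPU); the names are illustrative, and it compares Tensor flops to CPU flops exactly as the comment does.

```python
# 3090Ti figure quoted at the top of the thread (Tensor TFlops).
GPU_TENSOR_TFLOPS = 160.0

# CPU figures mentioned in the thread: "roughly 1TFlops", then "2 TFlops or 5 TFlops".
CPU_TFLOPS_CANDIDATES = (1.0, 2.0, 5.0)


def gap(cpu_tflops, gpu_tflops=GPU_TENSOR_TFLOPS):
    """Ratio of GPU to CPU throughput for the given figures."""
    return gpu_tflops / cpu_tflops


gaps = {cpu: gap(cpu) for cpu in CPU_TFLOPS_CANDIDATES}
for cpu, ratio in gaps.items():
    print(f"CPU at {cpu:g} TFlops -> gap of {ratio:g}x")
```

Even the largest CPU figure leaves a gap above 30x on these numbers.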
