My 5950X measures ~2 TFLOPS in single precision and ~1 TFLOPS in double precision (as expected, since a SIMD vector holds half as many double-precision elements). This is a desktop-class 16-core machine.
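As a rough sanity check on those figures, here is a back-of-the-envelope peak estimate. It assumes two 256-bit FMA units per core and a sustained all-core clock of about 3.9 GHz; the clock value is an assumption for illustration, not something measured in this thread, and `peak_flops` is just a hypothetical helper name.

```python
def peak_flops(cores, fma_units_per_core, vector_bits, element_bits, clock_hz):
    """Theoretical peak FLOP/s: each FMA lane does 2 flops (multiply + add) per cycle."""
    lanes = vector_bits // element_bits  # elements per SIMD vector
    return cores * fma_units_per_core * lanes * 2 * clock_hz

CLOCK_HZ = 3.9e9  # ASSUMED sustained all-core clock, not a measured value

# 16 cores, 2 FMA units per core (assumed), 256-bit AVX2 vectors
sp = peak_flops(16, 2, 256, 32, CLOCK_HZ)  # single precision: 8 lanes per vector
dp = peak_flops(16, 2, 256, 64, CLOCK_HZ)  # double precision: 4 lanes per vector

print(f"single precision: {sp / 1e12:.2f} TFLOPS")
print(f"double precision: {dp / 1e12:.2f} TFLOPS")
```

Under these assumptions the single-precision peak comes out at exactly twice the double-precision peak, which matches the 2:1 ratio above. Real measurements depend on the actual sustained clock and on how well the kernel keeps both FMA units fed.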
I've tried it on a 10980XE (18-core), which got between 600 GFLOPS and 1.6 TFLOPS depending on the instruction, in quad-channel mode. I'll try a 32-core Threadripper later. The challenge there, I guess, is keeping all cores busy during training without repeating the same gradient computation (both a scheduling and a memory problem).
Those are Tensor flops; the numbers for the Zen CPU are "general-purpose" flops (sometimes called "vector flops" in marketing material).
The vector flops for the 3090 Ti are 33 TFLOPS in single precision and 0.5 TFLOPS in double precision. So it is about 16x faster than the 5950X in single precision and 2x slower in double precision, at almost 3x the price and >4x the power consumption.
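The arithmetic behind those ratios, using only the throughput figures quoted in this thread (none of them independently verified here), can be sketched as:

```python
# Throughput in TFLOPS, as quoted in the comments above (not independently verified)
cpu_5950x = {"fp32": 2.0, "fp64": 1.0}
gpu_3090ti = {"fp32": 33.0, "fp64": 0.5}

# GPU-to-CPU ratio per precision: >1 means the GPU is faster, <1 means it is slower
ratios = {prec: gpu_3090ti[prec] / cpu_5950x[prec] for prec in cpu_5950x}

for prec, ratio in ratios.items():
    if ratio >= 1:
        print(f"{prec}: GPU is {ratio:.1f}x faster")
    else:
        print(f"{prec}: GPU is {1 / ratio:.1f}x slower")
```

This only compares peak vector throughput; it says nothing about memory bandwidth, or about the price and power figures mentioned above.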
Of course, if all you care about is AI, then there's no argument - but then we are not really talking about a general-purpose device any more.
The narrative of GPUs being "hundreds of times" faster than CPUs is vastly blown out of proportion for general-purpose computing.
I think you missed that this whole discussion is in the context of deep learning, so your comment does not apply. It is 30x slower than the 3090 Ti for that purpose.
https://old.reddit.com/r/Amd/comments/9uswbz/how_much_gflops...