Hacker News
by goosehonk 2393 days ago
It's a good thing you measured it :-) Programs that do a little bit of 512-bit FMA mixed in with other stuff will not benefit from AVX-512 but can suffer from the heat it generates, or from the hiccup when the CPU turns the FMA unit on and back off.

Codes that can do a lot of 512-bit FMA consecutively will benefit greatly, and pay a small penalty (up to 25%) in terms of throughput for everything else.

Codes that use non-multiplier stuff that's just marketed as AVX-512, like VBMI2, also benefit greatly and without any penalty.
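To put rough numbers on the trade-off between the first two cases, here is a back-of-envelope sketch in Python. The model is my own simplification, not a measurement: it assumes the FMA-heavy part of the program gets a fixed speedup (the 2x in the usage note is purely hypothetical), that everything else loses the full 25% of throughput quoted above (the stated upper bound, so a worst case), and it ignores the heat and the on/off hiccup entirely. The function names are made up for illustration.

```python
def relative_runtime(fma_fraction, fma_speedup, other_penalty):
    """Runtime with 512-bit FMA in use, relative to a baseline of 1.0.

    fma_fraction:  share of baseline runtime spent in FMA-heavy code (0..1)
    fma_speedup:   ASSUMED speedup of that code from 512-bit FMA
    other_penalty: fractional throughput loss for all other code
                   (0.25 means 25% lower throughput)
    """
    fma_time = fma_fraction / fma_speedup
    # Losing a fraction p of throughput stretches runtime by 1 / (1 - p).
    other_time = (1.0 - fma_fraction) / (1.0 - other_penalty)
    return fma_time + other_time


def break_even_fraction(fma_speedup, other_penalty):
    """FMA share of baseline runtime at which relative_runtime is exactly 1.0."""
    slowdown = 1.0 / (1.0 - other_penalty)
    return (slowdown - 1.0) / (slowdown - 1.0 / fma_speedup)


if __name__ == "__main__":
    # Hypothetical inputs: 2x FMA speedup, worst-case 25% penalty elsewhere.
    for share in (0.05, 0.40, 0.90):
        print(share, round(relative_runtime(share, 2.0, 0.25), 3))
    print("break-even share:", round(break_even_fraction(2.0, 0.25), 3))
```

Under those assumptions a program that spends 5% of its time in FMA ends up slower than baseline, one that spends 90% there ends up much faster, and the crossover is wherever `break_even_fraction` lands for your own numbers. Which is the same point as above: measure it.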

People with AMD CPUs don't get a choice. Hard to see how this accrues to Intel's mistakes column.

1 comment

It's not really an Intel mistake, but it is an Intel problem. In ML, the ASICs are coming. NVIDIA is pretty much guaranteed to maintain a leadership position in this space because their software layers are dominant. Intel's ML leadership position is quite tenuous because the killer ML features don't work well enough to justify the premium. MKL should be a solid moat, similar to NVIDIA's CUDA and cuDNN, but if it requires serious effort to get the benefits, it becomes more palatable to spend that effort on ARM-based servers or custom hardware like Inferentia, which are meaningfully cheaper. Maybe Ice Lake will fix this, but Intel is running out of time to convince people that Intel chips should remain the first choice in ML.

AMD isn't relevant in this space AFAIK.