by aseipp 3401 days ago
Thanks for the correction. And FWIW this is an insanely good question and it's hard for me to immediately answer! The things I want AVX512 for are very fat SIMD registers for my cryptographic code, and I've only lightly kept up with it since AVX-512 was pushed off to Skylake-Xeon only. So I haven't worried about highly heterogeneous workloads (in my head).

I'd have to look up the specifics, but does AVX512 simply slow the clock, or does it actually have some kind of limited number of hardware ports? I wonder whether some clock slowdown would be much of an issue, since clock-for-clock, you should see better performance on Skylake anyway.

Just curious, what kind of workloads do you think you're looking at here?

1 comment

I'm having trouble finding good references now, but I am sure I remember reading that it was simply slowing the clock.

In my case, yes, Skylake would still be a win over older hardware, but the question is whether to use AVX512 or not. The workload is a real-time animation system with a bunch of nodes in a graph that get evaluated in sequence. Some nodes would benefit from AVX512, but others would not. So the question is: if we vectorize the nodes that would benefit and get a speedup there, will the other, unvectorized nodes now run slower as a result of the lower clock speed, canceling out the benefit?
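
To make that trade-off concrete, here is a back-of-envelope sketch in Python. It is only a toy model, not a measurement: it assumes the reduced clock applies to every node in the graph for the whole evaluation (the pessimistic case), and that each node's runtime scales inversely with clock speed. The names `net_speedup` and `break_even_fraction` and the parameters `f`, `s`, and `r` are made up for illustration.

```python
def net_speedup(f, s, r):
    """Overall speedup of one graph evaluation under the toy model.

    f: fraction of baseline time spent in nodes that get vectorized (0..1)
    s: speedup of those nodes from AVX512 at an unchanged clock (s > 1)
    r: reduced clock / normal clock (0 < r <= 1); ASSUMED to apply to
       every node, vectorized or not
    """
    # Baseline time is normalized to 1.0. Vectorized nodes shrink by s,
    # then everything stretches by 1/r because of the lower clock.
    new_time = (f / s + (1.0 - f)) / r
    return 1.0 / new_time


def break_even_fraction(s, r):
    """Fraction f at which the vector gain exactly cancels the clock loss.

    Solves f/s + (1 - f) = r for f.
    """
    return (1.0 - r) / (1.0 - 1.0 / s)


if __name__ == "__main__":
    # Hypothetical numbers: 2x on vectorized nodes, 10% lower clock.
    s, r = 2.0, 0.9
    print("break-even fraction:", break_even_fraction(s, r))
    for f in (0.1, 0.2, 0.5):
        print(f, net_speedup(f, s, r))
```

With those invented numbers (a 2x gain on the vectorized nodes and a 10% clock drop), the break-even point is about 20% of the evaluation time: if the vectorizable nodes account for less than that, the model says the graph as a whole gets slower. Real hardware may only lower the clock for part of the evaluation, so this should be read as a worst-case bound rather than a prediction.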

It sounds like your case is a much better fit for AVX512. Out of curiosity, have you tried running on Xeon Phi, which also supports AVX512?