Hacker News
by pixelesque 868 days ago
Yeah, I haven't checked the more recent Intel/AMD processors in the last few years, but it used to be that on Intel CPUs only port 5 could execute shuffles, so code with fairly heavy shuffle usage could easily bottleneck on that one port.

It's better now that Ice Lake+ can do some shuffles and unpack operations on two ports, but bottlenecking on the shuffle ports can still be a problem.
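
To make the port-pressure argument above concrete, here is a rough sketch of a throughput bound, not a real CPU model. The uop counts, the `cycles_per_iteration` and `loop_bound` helpers, and the assumption that vector ALU uops can issue on p0, p1 and p5 are all illustrative. It also pretends every shuffle can use the second port, which, as noted above, is only true for some shuffles.

```python
from itertools import combinations


def cycles_per_iteration(uops, ports):
    """Lower bound on cycles per loop iteration from port pressure alone.

    uops:  list of (count, allowed_ports) pairs, one per kind of uop.
    ports: all execution ports in the modelled core.
    For every subset of ports, the uops that can run only inside that
    subset need at least count / len(subset) cycles; the bound is the
    worst such subset.
    """
    worst = 0.0
    for size in range(1, len(ports) + 1):
        for subset in combinations(ports, size):
            chosen = set(subset)
            confined = sum(n for n, allowed in uops if set(allowed) <= chosen)
            worst = max(worst, confined / size)
    return worst


# Assumption: vector ALU uops can issue on these three ports.
VECTOR_ALU_PORTS = ("p0", "p1", "p5")


def loop_bound(shuffle_ports):
    # Hypothetical loop body: 6 shuffles plus 6 other vector ALU uops.
    uops = [(6, shuffle_ports), (6, VECTOR_ALU_PORTS)]
    return cycles_per_iteration(uops, VECTOR_ALU_PORTS)


one_port = loop_bound(("p5",))        # shuffles confined to port 5
two_ports = loop_bound(("p1", "p5"))  # shuffles allowed on two ports
print(one_port, two_ports)            # prints "6.0 4.0"
```

With shuffles stuck on one port, that port alone sets the 6-cycle floor even though the other two ports sit half idle. With two shuffle ports, the floor drops to the 4 cycles you'd expect from 12 uops spread over 3 ports. For real code, measure with a tool like llvm-mca or uiCA rather than trusting a toy like this.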