by pbsd
683 days ago
A credible but unconfirmed rumor I've read is that Intel didn't want to do it because of the 512-bit shuffles. The E-cores (including those using the upcoming Skymont microarchitecture) are natively 128-bit and already double-pump AVX2, so quad-pumping AVX-512 would result in many-uop 512-bit shuffle instructions built out of 128-bit uops. This would render the shuffles unusable in practice, because their cost would be unpredictable: anywhere from 1 uop/cycle to 10-30 uops/cycles, depending on which core you are on at the moment. The situation would be similar to PEXT/PDEP, which cost almost nothing on Intel and dozens of cycles on AMD until a couple of generations ago.

Why does Zen 4 not have this problem? First, they're only double-pumping instead of quad-pumping. Second, while most of their AVX-512 implementation is double-pumped, there seems to be a full-length shuffle unit in there.
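To make the uop blow-up concrete, here is a toy Python model of a full-width 16-dword permute (roughly VPERMD-like semantics) rebuilt from steps that only ever touch one 128-bit lane at a time. The function names and the step-counting scheme are my own assumptions for illustration; this is not how any real Intel or AMD core decomposes the instruction, and the count is not a measurement.

```python
LANE = 4   # dwords per 128-bit lane
N = 16     # dwords in a 512-bit register


def permute_full(src, idx):
    """Reference: one full-width 16-dword permute, out[j] = src[idx[j] mod 16]."""
    return [src[i % N] for i in idx]


def permute_by_lanes(src, idx):
    """Same result using only 128-bit-wide steps.

    Hypothetical decomposition: for every output lane, visit every source
    lane once and merge in the elements that come from it. Returns the
    result and the number of lane-sized steps performed.
    """
    lanes = [src[k:k + LANE] for k in range(0, N, LANE)]
    out = []
    steps = 0
    for o in range(0, N, LANE):          # 4 output lanes
        acc = [0] * LANE
        for s, lane in enumerate(lanes):  # 4 candidate source lanes each
            for j in range(LANE):
                i = idx[o + j] % N
                if i // LANE == s:        # element lives in source lane s
                    acc[j] = lane[i % LANE]
            steps += 1                    # one in-lane shuffle-and-merge
        out.extend(acc)
    return out, steps


src = list(range(100, 116))
idx = list(reversed(range(N)))            # arbitrary cross-lane pattern
result, steps = permute_by_lanes(src, idx)
print(result == permute_full(src, idx), steps)
```

Because any output lane can pull from any of the four source lanes, this naive scheme needs 4 x 4 = 16 lane-sized steps for a single instruction, where a full-width shuffle unit would do it in one. A smarter decomposition might need fewer steps for specific index patterns, but the cost would still depend on the core you land on.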