by solidasparagus
2383 days ago
No, when we used MKL, the workload was slower, and turning off MKL made the workload faster. The marketing is irrelevant: using vectorized instructions slowed down the workload in practice, which is all that really matters. The Intel teams we were working with explained it as being due to the slower clock speeds caused by vectorized instructions. I don't really know the details myself, but it seems fair to assume that they do. It will be interesting to test Ice Lake when it makes it to the cloud, hopefully some time late next year, but until we can actually use Ice Lake, Skylake is what AVX-512 will be judged on.
|
Codes that can do a lot of 512-bit FMAs consecutively will benefit greatly, and pay a small penalty (up to 25%) in terms of throughput for everything else.
Codes that use non-multiplier stuff that's just marketed as AVX-512, like VBMI2, also benefit greatly and without any penalty.
People with AMD CPUs don't get a choice. Hard to see how this accrues to Intel's mistakes column.
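To put rough numbers on the FMA trade-off above, here is a back-of-the-envelope sketch. The model and both function names are my own assumptions, not measurements: it supposes the FMA-heavy part of a workload gets a net 2x speedup from 512-bit vectors, and that everything else runs at 25% lower throughput (the worst case quoted above). Real numbers depend on the chip, the core count, and how the code mixes instructions.

```python
def relative_runtime(fma_fraction, fma_speedup=2.0, penalty=0.25):
    """Runtime with 512-bit FMA enabled, relative to a baseline of 1.0.

    fma_fraction: share of baseline runtime spent in FMA-heavy kernels (0..1).
    fma_speedup:  assumed net speedup of those kernels (hypothetical 2x).
    penalty:      assumed throughput loss for all other code (0.25 = 25%).
    """
    if not 0.0 <= fma_fraction <= 1.0:
        raise ValueError("fma_fraction must be between 0 and 1")
    fma_time = fma_fraction / fma_speedup
    # Lower throughput means the remaining code takes proportionally longer.
    other_time = (1.0 - fma_fraction) / (1.0 - penalty)
    return fma_time + other_time


def break_even_fraction(fma_speedup=2.0, penalty=0.25):
    """FMA share of runtime at which enabling AVX-512 neither helps nor hurts."""
    slowdown = 1.0 / (1.0 - penalty)
    return (slowdown - 1.0) / (slowdown - 1.0 / fma_speedup)


if __name__ == "__main__":
    for share in (0.1, 0.4, 0.9):
        print(f"FMA share {share:.0%}: relative runtime {relative_runtime(share):.2f}")
    print(f"break-even FMA share: {break_even_fraction():.2f}")
```

Under these assumptions the break-even point is an FMA share of about 40%: below that the workload gets slower overall, which would be consistent with the MKL experience described upthread, and above it the workload gets faster.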