| Of interest regarding this might be: https://twitter.com/InstLatX64/status/934093081514831872 > The sad thing is there is no CPUID flag to distinguish good AVX512 from useless AVX512. You can read the the avx512_2ndFMA bit from the PIROM, according to this Intel datasheet: https://www.intel.com/content/www/us/en/processors/xeon/scal... Linux doesn't implement reading PIROM over SMBus, but it sure would be nice to expose this flag in /proc/cpuinfo. In WireGuard we're at the moment just disabling the zmm AVX512F implementation on Skylake-X, falling back to the still-fast-but-not-as-fast AVX512VL implementation that only touches ymm and doesn't downclock as much (following OpenSSL's reasoning on +/- Andy Polyakov's same implementation): https://git.zx2c4.com/WireGuard/tree/src/crypto/chacha20poly... I may look into trying to read the PIROM so that I can make a more informed decision. I've tested those Platinum boxes, and indeed it's a lot faster there, even with the [lesser] downclocking, whereas a Gold box didn't perform as well, making the ymm-only implementation necessary. |