by robocat
2172 days ago
The AVX512 instructions can cause strange global performance downgrades.

“One challenge with AVX-512 is that it can actually _slow down_ your code. It's so power hungry that if you're using it on more than one core it almost immediately incurs significant throttling. Now, if everything you're doing is 512 bits at a time, you're still winning. But if you're interleaving scalar and vector arithmetic, the drop in clock speeds could slow down the scalar code quite substantially.” - 3JPLW and https://blog.cloudflare.com/on-the-dangers-of-intels-frequen...

The processor does not immediately downclock when encountering heavy AVX512 instructions: it will first execute these instructions with reduced performance (say 4x slower) and only when there are many of them will the processor change its frequency. Light 512-bit instructions will move the core to a slightly lower clock.

* Downclocking is per core and for a short time after you have used particular instructions (e.g., ~2ms).

* The downclocking of a core is based on: the current license level of that core, and also the total number of active cores on the same CPU socket (irrespective of the license level of the other cores).

As per https://lemire.me/blog/2018/09/07/avx-512-when-and-how-to-us...
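To make the two bullet points concrete, here is a toy model in Python. It is only a sketch: the license numbering (0 = no downclock, 1 = light 512-bit, 2 = heavy 512-bit), the `TURBO_GHZ` table and the names `Core` and `frequency_ghz` are all made up for illustration, and the frequencies are invented, not taken from any real CPU. It also does not model the initial "reduced performance before the frequency change" phase.

```python
HOLD_MS = 2.0  # assumed hold time, based on the "~2ms" figure quoted above

# Hypothetical turbo table in GHz. Keys are license levels, each entry is
# (max active cores on the socket, frequency). Numbers are invented.
TURBO_GHZ = {
    0: [(4, 3.5), (16, 3.0)],
    1: [(4, 3.3), (16, 2.6)],
    2: [(4, 3.0), (16, 2.0)],
}


def frequency_ghz(license_level: int, active_cores: int) -> float:
    """Look up a core's clock from its own license and the socket's active core count."""
    for max_cores, ghz in TURBO_GHZ[license_level]:
        if active_cores <= max_cores:
            return ghz
    raise ValueError("more active cores than the table covers")


class Core:
    """Tracks when this core last ran instructions that require each license level."""

    def __init__(self) -> None:
        self.last_used_ms = {1: float("-inf"), 2: float("-inf")}

    def execute(self, now_ms: float, level: int) -> None:
        # Level 0 (scalar or narrow SIMD) never requests a license.
        if level > 0:
            self.last_used_ms[level] = now_ms

    def license_at(self, now_ms: float) -> int:
        # The core holds the highest license used within the last HOLD_MS.
        for level in (2, 1):
            if now_ms - self.last_used_ms[level] <= HOLD_MS:
                return level
        return 0


core = Core()
core.execute(now_ms=0.0, level=2)  # a burst of heavy 512-bit work
during_hold = frequency_ghz(core.license_at(1.0), active_cores=12)
after_hold = frequency_ghz(core.license_at(5.0), active_cores=12)
print(during_hold, after_hold)
```

In this model, scalar code that runs 1 ms after the heavy burst still gets the lower clock, which is the interleaving problem described in the quote. The other cores' licenses never enter the lookup, only how many of them are active.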
Can other SIMD instructions (AVX2, say) do the same?