by tarlinian
2129 days ago
No... the whole point is that the halt induced by a single instruction triggering a downclock isn't a real issue. Even in the pathological case where you insert a single instruction every 760 us to induce the maximum number of clock shifts, the total performance degradation due to the clock halts is only 3% (the frequency drop induced by these instructions has a much larger impact). Furthermore, on Ice Lake-SP, the halted time due to frequency transitions is supposed to go to 0, which makes this aspect of the problem entirely irrelevant. Yes, if you insert a single 512-bit FMA that runs every so often in your code, you will get a 15% performance hit from the lower frequency, but that's much less likely than the old case, where people trying to use AVX-512 for memcpy and the like would slow down scalar code.

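For anyone who wants to sanity-check the numbers above, here is a rough back-of-the-envelope sketch. It only uses the figures quoted in the comment (760 us spacing, 3% halt cost, 15% frequency cost). The function names are made up for illustration, and the model assumes the halt cost and the frequency cost compose multiplicatively, which is a simplification rather than anything measured.

```python
def implied_halt_us(spacing_us: float, halt_fraction: float) -> float:
    """Time spent halted per trigger period, if halt_fraction of wall time is lost to halts."""
    return spacing_us * halt_fraction


def combined_slowdown(halt_fraction: float, freq_drop: float) -> float:
    """Total throughput loss, ASSUMING the two effects multiply (a simplification)."""
    remaining = (1.0 - halt_fraction) * (1.0 - freq_drop)
    return 1.0 - remaining


if __name__ == "__main__":
    # Figures quoted in the comment: one heavy instruction every 760 us, 3% lost to halts.
    print("halted time per period (us):", implied_halt_us(760.0, 0.03))
    # Halts plus the 15% frequency penalty, versus the frequency penalty alone
    # (the case where halted time goes to 0).
    print("with halts:   ", combined_slowdown(0.03, 0.15))
    print("without halts:", combined_slowdown(0.0, 0.15))
```

Under those assumptions the halts account for a small slice of the total loss, and removing them still leaves the full frequency penalty in place.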
Now that the older and bigger case is fixed, this case remains the last sticking point, because you still can't trust the CPU to do the right thing when there are a small number of heavy instructions. Even if they cut the halting time to 0, it's still bad for a single instruction to cause a prolonged downclock.