by Dylan16807 2128 days ago
Either way, it's a big problem that certain single instructions can cause this transition. When the transition is based on a usage threshold of heavy instructions, it's not so bad. And with this revision, more of the transitions are threshold-based. But there are still some instructions that cause an immediate frequency change, if I'm reading the articles right.

No... the whole point is that the halt induced by a single instruction during downclocking isn't a real issue. Even in the pathological case where you insert a single instruction every 760 µs in order to induce the maximum number of clock shifts, the total performance degradation due to the clock halts is only 3% (the frequency drop induced by these instructions has a much larger impact). Furthermore, on Ice Lake-SP, the halted time due to frequency transitions is supposed to go to 0, which makes this aspect of the problem entirely irrelevant.

Yes, if you insert a single 512-bit FMA that runs every so often in your code, you will get a 15% performance hit from the lower frequency, but that's much less likely than the old case, where people trying to use AVX-512 for memcpy and the like would slow down their scalar code.
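To put the two figures above side by side, here is a rough back-of-the-envelope sketch. It assumes the 3% halt cost and the 15% frequency cost are independent and simply multiply, which is a simplification not stated in the thread. The function name is hypothetical, and the per-interval halt time is only what the 3% figure would imply, not a measured value.

```python
def relative_throughput(halt_fraction, freq_drop):
    """Fraction of baseline throughput left after both costs are applied.

    Assumption (not from the thread): the two effects are independent,
    so the remaining fractions multiply.
    """
    return (1.0 - halt_fraction) * (1.0 - freq_drop)


HALT_FRACTION = 0.03   # loss from clock halts in the pathological case
FREQ_DROP = 0.15       # loss from running at the lower frequency
INTERVAL_US = 760.0    # spacing between the single heavy instructions

halts_only = relative_throughput(HALT_FRACTION, 0.0)
freq_only = relative_throughput(0.0, FREQ_DROP)
both = relative_throughput(HALT_FRACTION, FREQ_DROP)

# Halted time per 760 us interval that a 3% loss would imply.
implied_halt_us = HALT_FRACTION * INTERVAL_US

print(f"halts only:     {halts_only:.4f} of baseline")
print(f"frequency only: {freq_only:.4f} of baseline")
print(f"both combined:  {both:.4f} of baseline")
print(f"implied halt per interval: {implied_halt_us:.1f} us")
```

Under that assumption the frequency drop dominates: removing the halts entirely recovers only a small slice of the total loss.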

But they fixed the old case, by having a minimum number of heavy instructions before changing clocks. If you have some instructions there just for the occasional memcpy, it will be a little slow during the memcpy but it won't downclock and the overall impact will be very small.

Now that the older and bigger case is fixed, this case remains the last sticking point, because you still can't trust the CPU to do the right thing when there are only a small number of heavy instructions. Even if they cut the halting time to 0, it's still bad for a single instruction to cause a prolonged downclock.
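A toy model may make the distinction in this comment concrete. It is only a sketch: the window granularity, the threshold of 100, and the hold time of 3 windows are made-up illustrative values, and the function name is hypothetical. Real hardware behavior is not specified in this thread.

```python
def downclocked_windows(events, threshold, hold):
    """Return, per time window, whether the toy core runs at reduced frequency.

    events: list of (heavy_count, has_immediate_trigger) tuples, one per window.
    threshold: heavy instructions in a window needed to trigger a downclock.
    hold: number of windows the core stays downclocked after a trigger.
    All three are illustrative assumptions, not documented hardware values.
    """
    remaining = 0
    result = []
    for heavy_count, immediate in events:
        # Either enough heavy instructions or one "immediate" instruction
        # (re)starts the hold period.
        if immediate or heavy_count >= threshold:
            remaining = hold
        result.append(remaining > 0)
        if remaining > 0:
            remaining -= 1
    return result


THRESHOLD = 100  # made-up value
HOLD = 3         # made-up value

# A few heavy instructions for an occasional memcpy: stays under the threshold.
occasional_memcpy = [(4, False), (0, False), (6, False), (0, False)]

# One instruction of the kind that triggers an immediate frequency change.
single_trigger = [(0, False), (1, True), (0, False),
                  (0, False), (0, False), (0, False)]

print(downclocked_windows(occasional_memcpy, THRESHOLD, HOLD))
print(downclocked_windows(single_trigger, THRESHOLD, HOLD))
```

In the first trace the core never leaves full speed. In the second, a single instruction costs several windows of reduced frequency even though no halt time is modeled at all, which is the remaining complaint.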