by anarazel
487 days ago
It performs terribly if you have an intermittent workload, e.g. a request-response workload where request processing is cheap (so that the time to increase the frequency matters). I've seen cases where it causes a more than 2x increase in request latency. It can be pretty annoying, because it means that systems can perform better under higher load, and that you get drastically different latency depending on whether a request is scheduled on a core that just processed another request (already at high frequency) or one that was idle. And because frequency control isn't fun enough, this behavior also exists with CPU idle states. Even at high frequency Linux can enter idle states... I've debugged several cases where this set of issues has caused unintuitive behavior, e.g. a) switching to more powerful servers drastically increased latency, b) optimized code resulting in higher latency / lower throughput, because the optimization provided enough idle cycles for a deeper idle state between requests, c) slightly increased IO latency leading to significantly worse overall performance, due to the IO taking long enough for the CPU to clock down.
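A rough way to check whether a given machine shows this effect is to time a small, fixed chunk of CPU work right after idle gaps of different lengths. The sketch below is only an illustration: `busy_work`, `measure_after_idle`, `sweep`, the iteration count and the gap values are all made up for this example, and interpreter and scheduler noise will blur the numbers. If the medians rise as the idle gap grows, frequency ramp-up or idle-state exit latency is a plausible suspect, though not the only possible cause.

```python
import statistics
import time


def busy_work(iterations=20_000):
    """A small, fixed CPU-bound unit standing in for a cheap request (hypothetical)."""
    total = 0
    for i in range(iterations):
        total += i * i
    return total


def measure_after_idle(idle_seconds, samples=5):
    """Median time in microseconds for busy_work() run right after sleeping idle_seconds."""
    timings = []
    for _ in range(samples):
        time.sleep(idle_seconds)  # let the core go idle and possibly clock down
        start = time.perf_counter_ns()
        busy_work()
        timings.append((time.perf_counter_ns() - start) / 1_000)
    return statistics.median(timings)


def sweep(idle_gaps=(0.0, 0.001, 0.01, 0.1)):
    """Map each idle gap (in seconds) to the median latency measured after it."""
    return {gap: measure_after_idle(gap) for gap in idle_gaps}


if __name__ == "__main__":
    for gap, usec in sweep().items():
        print(f"idle {gap * 1000:7.1f} ms -> median {usec:9.1f} us")
```

Pinning the process to one core and comparing runs under different governors would make the result easier to interpret, but that is left out here to keep the sketch portable.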
Actually, thinking this through, even then it doesn't make much sense to me: if you have that many short requests coming in, the CPU would simply never scale back as long as the rate is reasonably constant. It would first need to see some gap, and why not scale the CPU back in that gap (at the cost of having the first request of the next batch be a few milliseconds slower)? From there on, every subsequent request is fast again until there's another lull. Keeping the CPU always at high frequency should only be needed if you have a very tight deadline on that surprise request (high-frequency trading perhaps?), or if your requests are coincidentally always spaced by the same amount of time as the window CPU scaling measures across. I'm sure these things exist, but "intermittent workload" describes 90% of all workloads, and most workloads definitely aren't meaningfully impacted by CPU scaling.
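To put numbers on that argument, here is a toy model, not a description of how any real governor works. It assumes the core is at high frequency only if the previous request arrived less than `downclock_after` seconds earlier, and it ignores service time entirely. The function name, the 10 ms threshold and both arrival patterns are invented for illustration.

```python
def slow_fraction(arrival_times, downclock_after):
    """Fraction of requests that arrive after the core has clocked down (toy model)."""
    if not arrival_times:
        return 0.0
    slow = 1  # the very first request always finds an idle core
    for prev, cur in zip(arrival_times, arrival_times[1:]):
        if cur - prev >= downclock_after:
            slow += 1
    return slow / len(arrival_times)


# Bursty: 10 batches of 100 requests, 1 ms apart, with batches 1 s apart.
bursty = [batch * 1.0 + i * 0.001 for batch in range(10) for i in range(100)]

# Evenly spaced: 1000 requests, 20 ms apart (longer than the assumed threshold).
spaced = [i * 0.020 for i in range(1000)]

print(slow_fraction(bursty, downclock_after=0.010))  # 0.01
print(slow_fraction(spaced, downclock_after=0.010))  # 1.0
```

Under these assumptions the bursty pattern pays the penalty on 1% of requests, while the evenly spaced pattern pays it on every request. So which of the two views applies depends on how the request spacing compares to the scaling window.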