by daemonwrangler 4000 days ago
Something else to keep in mind is that you can get significant power savings when you lower the clock rate. So if you measure total power consumed to run a calculation, it may actually be more efficient to run on a fast CPU, finish quickly, and then drop into a low power state than it would be to run it on a low performance CPU for significantly longer.
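
To put rough numbers on that trade-off, here is a minimal sketch that compares total energy over a fixed time window for a "finish fast, then idle" CPU and a slower CPU that stays busy the whole time. The wattages and durations are made up purely for illustration, and the function name is hypothetical; with different numbers the slower CPU can come out ahead.

```python
def total_energy_joules(active_watts, idle_watts, work_seconds, window_seconds):
    """Energy used over a fixed window: busy for work_seconds, idle for the rest."""
    if work_seconds > window_seconds:
        raise ValueError("the work does not fit inside the window")
    idle_seconds = window_seconds - work_seconds
    return active_watts * work_seconds + idle_watts * idle_seconds


# Hypothetical figures, not measurements of any real CPU.
# Fast CPU: 30 W while busy, done in 2 s, then idles at 1 W for the remaining 8 s.
fast = total_energy_joules(active_watts=30.0, idle_watts=1.0,
                           work_seconds=2.0, window_seconds=10.0)
# Slow CPU: only 10 W while busy, but needs the whole 10 s window.
slow = total_energy_joules(active_watts=10.0, idle_watts=1.0,
                           work_seconds=10.0, window_seconds=10.0)

print(f"fast then idle: {fast} J, slow and steady: {slow} J")
```

The comparison only holds if the idle state really is cheap and the machine actually gets to stay in it.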
3 comments

This is the sort of factor that people forgot to include when testing SSDs for power/performance metrics in the early days, when they first came within reach of average home users. An SSD (especially some of the older models) can pull more power than a good spinning-metal drive when running at full force, but what some people didn't factor in was that SSDs do more in a given time, especially with latency-sensitive workloads. To do the same work as the traditional drive, an SSD needs to run at full pelt for far less time, which means quite a saving in power.

As well as slowing down under light load, modern CPUs can almost turn parts of themselves off when they are not needed. These are things that any CPU could potentially do, though; it isn't a difference between CISC and RISC designs.

I can't find a good reference now, but supposedly the i7 has a set of transistors that calculates if its workload would execute faster on multiple cores, or fewer cores, and can park cores to save heat, and let the electricity be focused into the unparked cores.

Intel's marketing material in 2008 mentioned the number of transistors doing the load calculations was about equal to the number of transistors in a 486. So you have a 486 constantly determining thread scheduling load, they claimed.

You misunderstood. The CPU doesn't get to decide how many cores are used; the operating system's scheduler does. The CPU just tries to keep an accurate running estimate of its power consumption and uses that to predict whether it has enough headroom to boost the clock speed above the nominal full speed. If some cores are temporarily idled by the OS, then that frees up a lot of power and allows the remaining cores to have their clock speed boosted further.
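
A toy model of that headroom logic, to show why idling cores lets the rest clock higher. This is not Intel's actual algorithm: the cubic power-versus-frequency relation, the 64 W budget, the coefficient, and the 3.8 GHz cap are all assumptions picked for illustration, and the function name is hypothetical.

```python
def boost_clock_ghz(active_cores, budget_watts=64.0,
                    watts_per_ghz_cubed=1.0, max_boost_ghz=3.8):
    """Highest clock the active cores can share without exceeding the power budget.

    Assumes each active core draws watts_per_ghz_cubed * f**3 watts, a common
    rough approximation when voltage is scaled along with frequency.
    """
    if active_cores < 1:
        raise ValueError("need at least one active core")
    per_core_watts = budget_watts / active_cores
    # Invert P = k * f**3 to get the frequency that uses the whole per-core share.
    uncapped_ghz = (per_core_watts / watts_per_ghz_cubed) ** (1.0 / 3.0)
    return min(max_boost_ghz, uncapped_ghz)


for cores in (4, 2, 1):
    print(cores, "active core(s):", round(boost_clock_ghz(cores), 2), "GHz")
```

In this model, every core the OS idles hands its share of the budget to the cores that are still running, until the cap is reached.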
Intel's marketing materials helped me misunderstand. Unless the OS is leveraging that logic when it calculates which CPU to park.

Does any OS know that unparked CPU clock speeds might increase when they park a CPU?

The operating systems have plenty of knowledge about how CPU power management works. They are hampered somewhat by how things like Turbo Boost are implemented in a backwards-compatible way through ACPI P-states that can't directly convey this information, but it's still pretty straightforward for an OS to support even more complicated schemes like ARM's big.LITTLE.

The real problem is that the OS seldom has enough information about the software workload to know whether it is better run on all cores, or just a few at higher clocks. It falls to application developers to not spawn more worker threads than are necessary.
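
On the application side, one way to avoid spawning more workers than necessary is to cap the pool at the smaller of the task count and the core count. A minimal sketch using Python's standard library; `pick_worker_count` is a hypothetical helper, and the right cap for a real workload may well be lower than the core count.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def pick_worker_count(n_tasks, cpu_count=None):
    """Never use more workers than there are tasks or logical cores (minimum 1)."""
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, min(n_tasks, cpu_count))


def run_all(func, items):
    """Run func over items with a pool sized by pick_worker_count."""
    items = list(items)
    with ThreadPoolExecutor(max_workers=pick_worker_count(len(items))) as pool:
        return list(pool.map(func, items))


squares = run_all(lambda x: x * x, [1, 2, 3])
print(squares)
```

With three tasks on an eight-core machine this starts three workers rather than eight, which leaves the OS free to idle the other cores.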

That may work under a synthetic workload where you know the beginning and end of the "heavy" load.

But I don't know if it holds up in real-life scenarios, particularly on multitasking platforms.