by ascar 1479 days ago
Do you have a reference for that? I googled and I couldn't find anything good.

While it might sound intuitive that SIMD instructions consume more power, I don't think that's necessarily true to a relevant degree in practice. My understanding is that CPU power consumption is mostly tied to inefficiencies that cause energy loss via heat, while the actual computation doesn't consume any energy per se. So electrons traveling a more complex path probably cause somewhat more energy loss, as there are more wires/transistors to pass. But most of the total loss doesn't actually occur in the ALU. Empirically, from what you can see operating systems do, the most effective way of consuming less power is running at a lower clock frequency, and the most effective way to get there is finishing the work faster, which isn't tied to the number of instructions.

The Stack Overflow question here [1] seems to suggest that SIMD vs no SIMD has a neglectable overhead compared to entering a lower power state sooner.
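
To make the "finish sooner, idle sooner" argument concrete, here is a back-of-the-envelope sketch. Every number in it is made up for illustration (the wattages, the durations, and the `energy_joules` helper are my assumptions, not measurements); it only shows the arithmetic of energy = power x time over a fixed window.

```python
def energy_joules(active_watts, active_seconds, idle_watts, window_seconds):
    """Energy used over a fixed window: active for a while, idle for the rest."""
    if active_seconds > window_seconds:
        raise ValueError("work does not fit in the window")
    idle_seconds = window_seconds - active_seconds
    return active_watts * active_seconds + idle_watts * idle_seconds


# Hypothetical scalar run: lower power draw, but busy for 4 of the 5 seconds.
scalar = energy_joules(active_watts=10, active_seconds=4,
                       idle_watts=1, window_seconds=5)

# Hypothetical SIMD run: 20% higher draw while active, but done after 1 second.
simd = energy_joules(active_watts=12, active_seconds=1,
                     idle_watts=1, window_seconds=5)

print(scalar, simd)  # 41 16
```

With these invented figures the higher per-second draw is swamped by the shorter active time. Whether real hardware behaves like this depends on the actual power states and workload.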

[1] https://stackoverflow.com/questions/19722950/do-sse-instruct...

1 comments

I think, in summary, what you are alluding to is instruction decoding and scheduling, as the sibling comment points out, which is indeed a large cost in both speed and power.

> SIMD vs no SIMD has a neglectable overhead compared to entering a lower power state sooner.

Yes, on the same CPU, as I suggested, the real-world difference may be unmeasurable. However, note that this particular case is interesting because it's not comparing fewer serial multiplies to more SIMD multiplies; it's comparing SIMD multiplies to no multiplies but with a serial constraint due to a variable dependence. I.e. it's SIMD vs. no multiply, without any other difference in the number or type of ALU ops. That again could make no difference on a big x86 in practice, but it would be interesting to know.
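
A rough sketch of the two loop shapes I mean, assuming the case is something like evaluating a quadratic per index. The function names and the quadratic itself are just my illustration. Python won't vectorize either loop; the point is only the dependency structure a compiler would see.

```python
def with_multiplies(a, b, c, n):
    # Each element depends only on i, so iterations are independent
    # and a compiler could compute several of them per SIMD instruction.
    return [a * i * i + b * i + c for i in range(n)]


def with_running_sums(a, b, c, n):
    # No multiplies inside the loop, but every iteration needs the
    # previous y and delta: a loop-carried (serial) dependency.
    out = []
    y = c            # value at i = 0
    delta = a + b    # y(1) - y(0)
    step = 2 * a     # the second difference of a quadratic is constant
    for _ in range(n):
        out.append(y)
        y += delta
        delta += step
    return out


# Both formulations produce the same values for integer inputs.
assert with_multiplies(3, 2, 1, 10) == with_running_sums(3, 2, 1, 10)
```

The first version does more arithmetic but has no chain between iterations; the second does only adds but has to run them in order.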

All of this changes if you are coding for a lower-power device and have a choice of hardware.