by twtw 2655 days ago
> That's... not the way hardware works.

Regarding transistor / area budget, sure.

Regarding power budget, that absolutely could be the way hardware works. Modern chips can and do power off pieces of silicon when they are not needed. Whether that would be worth it for instruction decoding, I don't know, but it could be.

Powering off pieces of silicon when they are not needed is done through "clock gating", where you stop feeding a clock to a block that is not needed.

That is only possible when you deal with isolated parts. You cannot, for example, power down an instruction decoder's ability to understand one particular syntax; you can only power down the entire instruction decoder. Trying to design it so that sub-features of a block like that can be powered down would not be productive.

A realistic use of clock gating would be something like powering down the actual execution units ("We don't need AVX-512, so let's not waste power on the execution units"), but that doesn't help save the power wasted on legacy support.

You can absolutely design the instruction decoder into two parallel decoders that decode AArch32 and AArch64 respectively. Splitting Thumb from the rest of AArch32 probably doesn't make sense, and on x86 it probably doesn't make sense to break out 32-bit and 64-bit, but I can absolutely see the case for AArch32 vs AArch64.

> You can absolutely design the instruction decoder into two parallel decoders that decode AArch32 and AArch64 respectively.

You can design anything. The question is whether the added design complexity (which for silicon directly translates to increased power consumption) outweighs the benefits.

Thus:

>> Whether that would be worth it for instruction decoding, I don't know

There would be significant overhead to design a decoder such that it could switch between legacy and aarch64 only, but it could conceivably be done.

fyi clock gating isn't the same as power gating.

> There would be significant overhead to design a decoder such that it could switch between legacy and aarch64 only, but it could conceivably be done.

What you'd do then is to split the decoder into several blocks, so that there's a fan-out from a main decoder into the different sub-decoders, and then power down the sub-decoders. It's still entire blocks you power down.

Plus, I think the increased power consumption from this design (especially considering that the decoder now needs to stall on powered-down sub-decoders) will outweigh the savings from powering down any sub-decoders.
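
To make the trade-off above concrete, here is a back-of-the-envelope sketch. The function name and all inputs are hypothetical, and it assumes the overhead (extra dispatch logic plus stalls while a sub-decoder wakes up) can be lumped into one average power figure, which is a big simplification.

```python
def net_saving_watts(sub_decoder_watts, idle_fraction, overhead_watts):
    """Average power saved by gating a sub-decoder, minus the added overhead.

    All inputs are hypothetical. overhead_watts lumps together the extra
    dispatch logic and the cost of stalling while a sub-decoder wakes up.
    """
    if not 0.0 <= idle_fraction <= 1.0:
        raise ValueError("idle_fraction must be between 0 and 1")
    # Power is only saved during the fraction of time the block is idle.
    return sub_decoder_watts * idle_fraction - overhead_watts


# Made-up numbers: a 10 mW sub-decoder that sits idle 90% of the time.
print(net_saving_watts(0.010, 0.9, 0.012))  # negative: gating costs more than it saves
print(net_saving_watts(0.010, 0.9, 0.005))  # positive: gating pays off
```

Whether a real design lands on the positive or negative side depends entirely on numbers nobody in this thread has.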

> fyi clock gating isn't the same as power gating.

Of course not. Both clock gating and power gating are power-saving techniques. Both eliminate switching current in the gated block, while power gating also removes leakage current, at the cost of larger architectural changes than those required by clock gating.
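
A toy model of the distinction, for illustration only. It uses the standard CMOS switching-power approximation (activity × capacitance × voltage² × frequency) and treats both kinds of gating as ideal; real power gating still leaks a little through the sleep transistors and costs energy and latency to wake up. The `Block` class, `block_power` function, and every number are made up.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Toy model of one logic block; every value here is illustrative."""
    activity: float    # fraction of capacitance switched per cycle (alpha)
    cap_farads: float  # effective switched capacitance C
    vdd: float         # supply voltage V
    freq_hz: float     # clock frequency f
    leak_amps: float   # leakage current while the block is powered

    def dynamic_power(self) -> float:
        # Classic CMOS switching power: alpha * C * V^2 * f
        return self.activity * self.cap_farads * self.vdd ** 2 * self.freq_hz

    def leakage_power(self) -> float:
        # Leakage flows whenever the supply is connected, clock or no clock.
        return self.vdd * self.leak_amps


def block_power(block: Block, clock_gated: bool = False,
                power_gated: bool = False) -> float:
    """Power drawn by the block, assuming ideal gating (an assumption)."""
    if power_gated:
        return 0.0  # supply cut: no switching and, ideally, no leakage
    dynamic = 0.0 if clock_gated else block.dynamic_power()
    return dynamic + block.leakage_power()


decoder = Block(activity=0.1, cap_farads=1e-9, vdd=1.0,
                freq_hz=2e9, leak_amps=0.05)
print(block_power(decoder))                    # switching + leakage
print(block_power(decoder, clock_gated=True))  # leakage only
print(block_power(decoder, power_gated=True))  # nothing (idealized)
```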

I'm out on a limb here, but I don't think power gating makes much sense outside extreme low-power devices.