| > Uh... yes you do? How else do you think it works?

No, I literally explained it in my first answer.
The part about "1590 decoders" is irrelevant, since I misunderstood your message (I thought you were talking about using 16 decoders to decode the 16 instruction lengths of a single instruction). But the rest, on instruction length decode, is how you actually do it.

> I'm saying that it isn't remotely an intractable power problem.

I mean, obviously, if you ignore all the power consumption issues of running 32 decoders in parallel and using only 5 of the 32 results, then yes, there's no problem. But in reality, yes, it is a problem to decode many x86 instructions in parallel.

> Just draw it out: check the gates required for a 64->128 Dadda multiplier or 256 bit SIMD operation and compare with what you'd need here. It's noise.

Yes, the energy consumption of the multipliers is high, but I don't see how that is an argument for making an inefficient decoder.
Also, a multiplier's power consumption depends on transistor activity, and you can expect the most significant bits of the operands not to change much. For a decoder, the transistor activity will be high.

> And your citation of "8 instructions in parallel" seems suspicious. Did I just get trolled into an Apple vs. x86 flame war?

Neither a troll nor a flame war. I don't use Apple products, mainly because I don't agree with Apple's practices.
But choosing a RISC ISA really does allow them to decode a lot of instructions in parallel with little energy and complexity. I chose 8 because it is the maximum that the mainstream currently sees.
You might argue that 8 RISC instructions are not comparable to 8 CISC instructions, but even with, say, 4 CISC instructions, the decoder will still consume more energy. |
Alder Lake decodes six. And again, your intuition about power costs here is just simply wrong. Instruction decode is Simply Not a major part of the power budget of a modern x86 CPU. It's not.
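
For anyone following along, the scheme being argued about (length-decode speculatively at every byte offset of the fetch window, then keep only the results that land on real instruction boundaries) can be sketched with a toy model. Everything here is an assumption for illustration: `toy_length` and its encoding are made up and are not x86, the 32-byte window and 5-instruction limit just mirror the numbers quoted in the thread, and the sketch says nothing either way about how much power real hardware spends on this.

```python
WINDOW = 32  # bytes per fetch window (assumption, mirrors the "32 decoders" figure)


def toy_length(first_byte: int) -> int:
    """Hypothetical encoding: the low 3 bits of the first byte give a length of 1..8.

    Real x86 length decode also depends on prefixes, ModRM, SIB, immediates, etc.
    """
    return (first_byte & 0x07) + 1


def speculative_lengths(window: bytes) -> list[int]:
    # One length decoder per byte offset, each assuming an instruction starts there.
    # In hardware these would all run in the same cycle.
    return [toy_length(b) for b in window]


def select_starts(lengths: list[int], max_insns: int) -> list[int]:
    # Serial part: each true start depends on the previous instruction's length.
    starts = []
    pos = 0
    while pos < len(lengths) and len(starts) < max_insns:
        if pos + lengths[pos] > len(lengths):
            break  # instruction straddles the window; leave it for the next fetch
        starts.append(pos)
        pos += lengths[pos]
    return starts


# A made-up 32-byte window: instructions of length 5, 3, 1, 8, 1, ...
window = bytes([4, 0, 0, 0, 0, 2, 0, 0, 0, 7] + [0] * 22)
lengths = speculative_lengths(window)
starts = select_starts(lengths, max_insns=5)
unused = len(lengths) - len(starts)

print(starts)  # [0, 5, 8, 9, 17]
print(unused)  # 27 speculative results discarded
```

In this toy run, 32 length results are computed and 5 are kept. Whether that discarded work is "noise" or a real cost is exactly the point of disagreement above, and the sketch cannot settle it.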