|
|
|
|
|
by fyp
2240 days ago
|
|
Isn't that the wrong direction for the optimization? I would assume you would want to compile adding two numbers into shifting by one, not the other way around. (I know nothing about hardware, it just intuitively seems like moving a bunch of bits over by 1 should be faster than dealing with xor and carries) |
|
To put this in more concrete terms: an N-bit adder involves N 1-bit stages to add each bit, and then a 1-bit carry network on top of that, which has N stages in it. So overall, it's O(N) in terms of hardware. An N-bit shift unit is going to use lg N N-bit muxes--or O(N lg N) in terms of hardware. Total gate delay in both cases is O(lg N), but adders have O(N) hardware (and thus energy consumption) while shifters have O(N lg N).
A secondary consequence of being larger area is that a superscalar architecture may choose to have one execution unit that has an adder and a shifter and a second that only has the adder. So an addition may schedule better than a shift, since there are more things it can execute on.