by tavianator
527 days ago
Right. Actually it turns out it's 11 bits, since everything in [-1024, 1023] is supported by the immediate add renamer.

In general I think people are overstating the delay of an additional 64-bit add on register file reads (though I'm not a hardware guy, so maybe someone can correct me). There are carry-lookahead adders with log_2(n) == 6 gate delays for n == 64. Carry-save adders might also be relevant to how they can do multiple dependent adds with 0c latency.

> And, presumably, the OP shift case here is in fact a case of there not being a built-in immediate adder and thus a need for fixup uops being inserted to materialize it?

No, the perf counters show 1 uop dispatched/retired in both the slow and fast cases.
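To make the two numbers above concrete, here's a rough Python sketch. It is not a model of any real CPU: `fits_imm11` and `kogge_stone_add` are names I made up for illustration, and I've picked a Kogge-Stone parallel-prefix adder as one example of the carry-lookahead family. It checks the signed 11-bit range and counts the prefix levels for a 64-bit add. A "level" here is one generate/propagate combine step, which is only loosely comparable to a gate delay in actual silicon.

```python
def fits_imm11(x):
    """True if x fits in a signed 11-bit immediate, i.e. [-1024, 1023]."""
    return -(1 << 10) <= x <= (1 << 10) - 1


def kogge_stone_add(a, b, width=64):
    """Add a and b modulo 2**width with a Kogge-Stone prefix network.

    Returns (sum, levels), where levels is the number of prefix
    combine steps, which is log2(width) for power-of-two widths.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask

    g = a & b      # generate: this bit produces a carry by itself
    p = a ^ b      # propagate: this bit passes an incoming carry along
    p0 = p         # keep the per-bit propagate for the final sum

    levels = 0
    shift = 1
    while shift < width:
        # Combine each bit's (g, p) with the group `shift` bits below it.
        g = (g | (p & (g << shift))) & mask
        p = (p & (p << shift)) & mask
        shift <<= 1
        levels += 1

    carries = (g << 1) & mask   # carry into bit i is the group generate of bits 0..i-1
    return (p0 ^ carries) & mask, levels


if __name__ == "__main__":
    total, levels = kogge_stone_add(0x1234_5678_9ABC_DEF0, 0x0FED_CBA9_8765_4321)
    print(hex(total), levels)
```

The loop doubles `shift` each time (1, 2, 4, 8, 16, 32), which is where the log_2(64) == 6 comes from.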