by tavianator 527 days ago
Right. Actually it turns out it's 11 bits, since [-1024, 1023] are all supported by the immediate add renamer.
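To spell out the arithmetic: [-1024, 1023] is exactly the range of an 11-bit two's-complement value. A minimal sketch of that bound check follows; the name `fits_signed` is hypothetical, and it only models the range, not how the renamer actually decides which adds are eligible.

```python
def fits_signed(value: int, bits: int = 11) -> bool:
    """Return True if value is representable as a bits-wide two's-complement integer."""
    lo = -(1 << (bits - 1))       # -1024 when bits == 11
    hi = (1 << (bits - 1)) - 1    # 1023 when bits == 11
    return lo <= value <= hi
```

With the default width, `fits_signed(1023)` and `fits_signed(-1024)` hold, while `fits_signed(1024)` does not.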

In general I think people are overstating the delay of an additional 64-bit add on register file reads (though I'm not a hardware guy, so maybe someone can correct me). Carry-lookahead adders have a logic depth on the order of log_2(n), so roughly 6 levels for n = 64. Carry-save adders might also be relevant to how they can do multiple dependent adds with zero-cycle latency.
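To illustrate why carry-save form could help with chains of dependent adds, here is a software sketch. It is not a claim about what the hardware actually does: `carry_save`, `add_chain`, and the 64-bit `MASK` are hypothetical names for illustration. Each step reduces three operands to a (sum, carry) pair using only per-bit logic, so no carry ripples across bit positions until the single real add at the end.

```python
from typing import List, Tuple

MASK = (1 << 64) - 1  # model 64-bit registers (assumption for this sketch)

def carry_save(a: int, b: int, c: int) -> Tuple[int, int]:
    """3:2 compressor: reduce three operands to (sum, carry) without carry propagation."""
    s = a ^ b ^ c                              # per-bit sum
    cy = ((a & b) | (a & c) | (b & c)) << 1    # per-bit majority, moved up one position
    return s & MASK, cy & MASK

def add_chain(base: int, imms: List[int]) -> int:
    """Fold a chain of adds through carry-save steps, then do one real add."""
    s, cy = base & MASK, 0
    for imm in imms:
        s, cy = carry_save(s, cy, imm & MASK)  # constant depth per step
    return (s + cy) & MASK                     # the only carry-propagating addition
```

For example, `add_chain(100, [1, 2, -3, 1023, -1024])` gives 99, the same as adding the immediates one at a time, but only the final step pays for a full carry chain.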

> And, presumably, the OP shift case here is in fact a case of there not being a built-in immediate adder and thus a need for fixup uops being inserted to materialize it?

No, the perf counters show 1 uop dispatched/retired in both the slow and fast cases.

1 comment

Ah, good to know on the uop count. Still could be (or, well, has to be to some extent) the same concept, just pipelined within one uop.