Hacker News new | ask | show | jobs
by themafia 62 days ago
XOR and SUB have had identical cycle counts and latencies since the 8088. That's because you can "look ahead" when doing carries in binary. It's just a matter of how much floorspace on the chip you want to use.

https://en.wikipedia.org/wiki/Carry-lookahead_adder

The only minor difference between the two on x86, really, is SUB sets OF and CF according to the result while XOR always clears them.

2 comments

A carry lookahead adder makes your circuit depth logarithmic in the width of the inputs vs linear for a ripple carry adder, but that is still asymptotically worse than XORs constant depth.

(But this does not discount the fact that basically all CPUs treat them both as one cycle)

OF/CF/AF are always cleared anyway by SUB r,r. So there's absolutely no difference.
The point is OF/CF are sometimes dependent on the inputs for SUB. They never are for XOR.
Ah, you mean in terms of complexity of the calculation. Thanks for clarifying.

In practice AF and CF can be computed from the carry out vector which is already available, and OF is a single XOR (of the two most significant bits of the carry out vector). The same circuitry works for XOR and SUB if the carry out vector of XOR is simply all zeroes.

It also clears any dependence on the state of those flags. Which is probably not useful in practice.