| I remember a lot of code zeroing registers, dating back at least to the IBM PC XT days (before the 80286). If you decode the instruction, it makes sense to use XOR:
- mov ax, 0 - needs 4 bytes (66 b8 00 00)
- xor ax,ax - needs 3 bytes (66 31 c0)
This extra byte did matter in a machine with less than 1 megabyte of memory. On 386 processors the same held:
- mov eax,0 - needs 5 bytes (b8 00 00 00 00)
- xor eax,eax - needs 2 bytes (31 c0)
Here Intel made the decision to use only 2 bytes. I bet this helps both the instruction decoder and (of course) saves more memory than the old 8086 instruction. |
Never mind that, as the author also mentions, the xor idiom takes essentially zero cycles to execute on modern out-of-order cores. Nothing actually happens besides assigning a new pre-zeroed physical register to the logical register name early in the pipeline, after which the instruction is retired.
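For anyone who wants to check the arithmetic in the quote, here is a quick sketch in Python that tallies the byte counts. The hex strings are copied straight from the quoted comment; the names `ENCODINGS`, `SIZES`, and `size` are just mine for illustration. Note that the 16-bit forms listed there carry a 66 operand-size prefix, which is how they encode in 32-bit code. As far as I know, native 16-bit code on an 8086 drops the prefix (b8 00 00 vs 31 c0), so the one-byte gap is the same either way.

```python
# Encodings as quoted above (hex bytes, space-separated).
# Assumption: the 66-prefixed forms are the 32-bit-mode encodings of the
# 16-bit instructions, as written in the quoted comment.
ENCODINGS = {
    "mov ax, 0": "66 b8 00 00",
    "xor ax, ax": "66 31 c0",
    "mov eax, 0": "b8 00 00 00 00",
    "xor eax, eax": "31 c0",
}


def size(hex_bytes: str) -> int:
    """Return the number of bytes in a space-separated hex string."""
    return len(bytes.fromhex(hex_bytes))


# Instruction -> encoded length in bytes.
SIZES = {insn: size(hex_bytes) for insn, hex_bytes in ENCODINGS.items()}

# Bytes saved by using the xor idiom instead of mov with an immediate zero.
saved_16 = SIZES["mov ax, 0"] - SIZES["xor ax, ax"]
saved_32 = SIZES["mov eax, 0"] - SIZES["xor eax, eax"]

if __name__ == "__main__":
    for insn, n in SIZES.items():
        print(f"{insn:<14} {n} bytes")
    print(f"xor saves {saved_16} byte (16-bit) and {saved_32} bytes (32-bit)")
```

The saving grows with operand size because mov has to carry the full-width immediate zero, while xor only needs an opcode and a ModRM byte.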