by jcranmer 2852 days ago
> For example, on page 11 it says "Note that when the lower 32-bit eax portion of the 64-bit rax register is set, the upper 32-bits are unaffected." In reality, the high order bits are zeroed to avoid a data dependency.

Well, there is one case where the upper 32-bits are not zeroed. It turns out that xor eax, eax is assigned to opcode 0x90, which is better known to most people as NOP.

If you want real fun, read up on what happens with AVX registers. Whether the upper bits are left untouched or zeroed depends on whether you use VEX encoding.


No, in section 3.4.1.1 General-Purpose Registers in 64-Bit Mode of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, it says, "32-bit operands generate a 32-bit result, zero-extended to a 64-bit result in the destination general-purpose register."
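
To make the zero-extension rule concrete, here is a rough Python model of what a register write does in 64-bit mode. The function name `write_reg` and the sample values are made up for illustration; this is a sketch of the rule quoted above (plus the well-known fact that 8- and 16-bit writes preserve the remaining bits), not a description of how any CPU implements it. High-byte registers such as ah are not modeled.

```python
MASK64 = (1 << 64) - 1


def write_reg(old: int, value: int, width: int) -> int:
    """Return the 64-bit register contents after writing `value`
    into the low `width` bits (hypothetical helper, 64-bit mode)."""
    if width == 64:
        return value & MASK64
    if width == 32:
        # 32-bit results are zero-extended to the full 64-bit register.
        return value & 0xFFFF_FFFF
    if width in (8, 16):
        # 8- and 16-bit writes leave all remaining upper bits untouched.
        mask = (1 << width) - 1
        return (old & ~mask & MASK64) | (value & mask)
    raise ValueError(f"unsupported operand width: {width}")


rax = 0x1122_3344_5566_7788
print(hex(write_reg(rax, 0xAABB_CCDD, 32)))  # like mov eax, 0xAABBCCDD
print(hex(write_reg(rax, 0xEEFF, 16)))       # like mov ax, 0xEEFF
```

The 32-bit write wipes the upper half of the old value, while the 16-bit write keeps everything above bit 15.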

`xor eax, eax` actually generates 0x31 0xc0, and `xor rax, rax` generates 0x48 0x31 0xc0. Outside long mode, 0x90 decodes to xchg eax, eax, which has no effect. In long mode, opcode 0x90 still has no effect, but it is no longer equivalent to xchg eax, eax.
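
For anyone who wants to check those bytes by hand, here is a small Python sketch that builds them from the standard register numbers. The helper names `encode_xor_rr` and `encode_xchg_eax_short` are invented for this example, and it only covers the register-to-register forms of the eight legacy registers, so treat it as an illustration rather than an assembler.

```python
# Standard x86 register numbers for the eight legacy 32-bit registers.
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}


def encode_xor_rr(dst: str, src: str, rex_w: bool = False) -> bytes:
    """XOR r/m32, r32: opcode 0x31, then ModRM with mod=11, reg=src, rm=dst.
    With rex_w=True a REX.W prefix (0x48) selects the 64-bit registers
    that share the same numbers (eax -> rax, and so on)."""
    modrm = 0xC0 | (REG32[src] << 3) | REG32[dst]
    prefix = bytes([0x48]) if rex_w else b""
    return prefix + bytes([0x31, modrm])


def encode_xchg_eax_short(other: str) -> bytes:
    """Short-form XCHG eAX, r32: a single byte, 0x90 plus the register number."""
    return bytes([0x90 + REG32[other]])


print(encode_xor_rr("eax", "eax").hex())               # xor eax, eax
print(encode_xor_rr("eax", "eax", rex_w=True).hex())   # xor rax, rax
print(encode_xchg_eax_short("eax").hex())              # the 0x90 slot
print(encode_xchg_eax_short("ebx").hex())              # xchg eax, ebx
```

The short XCHG form is where the confusion comes from: register number 0 lands exactly on 0x90, the byte everyone knows as NOP.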

Indeed xchg eax,eax is a nop idiom; there are many. Recent microarchitectures simply ignore nops without executing anything. From the Intel® 64 and IA-32 Architectures Optimization Reference Manual:

16.2.2.6 NOP Idioms

NOP instruction is often used for padding or alignment purposes. The Goldmont and later microarchitecture has hardware support for NOP handling by marking the NOP as completed without allocating it into the reservation station. This saves execution resources and bandwidth. Retirement resource is still needed for the eliminated NOP.

This nop idiom is very special, however, since it isn't just about efficiency: if it weren't a nop idiom, xchg eax, eax would not be a nop at all. It would clear the upper 32 bits of the register, just as xchg ebx, ebx does (or the same instruction with any register other than eax).

> It turns out that xor eax, eax is assigned to opcode 0x90, which is better known to most people as NOP.

This can't be true, since xoring a register with itself zeroes that register, and zeroing a register can't possibly be a general NOP instruction.

XOR also sets flags, another thing NOPs can't do.

That's because it's actually xchg eax, eax, not xor eax, eax.

xor eax, eax is a zero (or dependency-breaking) idiom. Zero idioms are detected and removed by the renamer, so they have no execution latency.