by jleahy
1534 days ago
On a modern x86 CPU the ‘xchg’ instruction performs a swap and can do so entirely in the front-end via register renaming. It doesn’t even require a micro-op.
This ‘trick’ (the XOR swap) actually requires executing micro-ops and creates an unnecessary data dependency, which is worse than the micro-ops themselves. Better to just use more variables and let the compiler’s register allocator decide what to do. If the swap is inside a loop, for example, unrolling it once could remove the need for any swapping at all.
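A rough sketch of the point above, in C. All function names here are hypothetical, and Fibonacci is only an assumed stand-in for "a loop that swaps"; it is not from the original comment. What instructions the compiler actually emits for any of these depends on the compiler and flags.

```c
#include <stdint.h>

/* XOR swap: three operations, each one depending on the result of the
 * previous one, so they form a serial dependency chain. */
void xor_swap(uint64_t *a, uint64_t *b) {
    if (a == b) return; /* without this guard, aliased pointers zero the value */
    *a ^= *b;
    *b ^= *a;
    *a ^= *b;
}

/* Plain swap through a temporary: the register allocator is free to
 * rename registers, emit moves, or drop the swap entirely. */
void tmp_swap(uint64_t *a, uint64_t *b) {
    uint64_t t = *a;
    *a = *b;
    *b = t;
}

/* Loop that rotates two variables on every iteration. */
uint64_t fib_rotate(unsigned n) {
    uint64_t a = 0, b = 1;
    for (unsigned i = 0; i < n; i++) {
        uint64_t t = a + b;
        a = b;
        b = t;
    }
    return a; /* F(n) */
}

/* Same loop unrolled once: a and b alternate roles, so no swap or
 * temporary is needed. Invariant at loop top: a = F(i), b = F(i+1). */
uint64_t fib_unrolled(unsigned n) {
    uint64_t a = 0, b = 1;
    unsigned i = 0;
    for (; i + 1 < n; i += 2) {
        a += b; /* a = F(i+2) */
        b += a; /* b = F(i+3) */
    }
    /* If one step is left over (i == n - 1), the answer is already in b. */
    return (i < n) ? b : a;
}
```

Both Fibonacci versions return the same values; the unrolled one simply never needs to exchange `a` and `b`.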
|
This is only true for AMD CPUs; on Intel, xchg is 3 uops. Still better than the xor trick, though.
Source: https://www.uops.info/html-instr/XCHG_R64_R64.html