|
|
|
|
|
by wolfgke
3412 days ago
|
|
When I look at the generated code that multiple of the compilers (gcc,clang,icc) generate mov eax, edi
mov edi, esi
mov esi, eax
I would intuitively use the `xchg` instruction that x86-32/x86-64 provides instead. Is there a specific reason why the compiler(s) decide to generate the mentioned code instead? |
|
"Instructions with a LOCK prefix have a long latency that depends on cache organization and possibly RAM speed. If there are multiple processors or cores or direct memory access (DMA) devices then all locked instructions will lock a cache line for exclusive access, which may involve RAM access. A LOCK prefix typically costs more than a hundred clock cycles, even on single-processor systems. This also applies to the XCHG instruction with a memory operand."