Swapping in place (the XOR trick) is inefficient for data stored in memory, such as the elements of a matrix.
The most efficient way to swap 2 values stored in memory is to use 2 load instructions and 2 store instructions, like "load X in R1; load Y in R2; store R1 in Y; store R2 in X".
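An XOR sequence on memory operands needs at least the same loads and stores, plus the three XOR operations on top. Therefore, for swapping memory values the XOR trick has never been useful in the entire history of automatic computers.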
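To make the comparison concrete, here is a minimal C sketch of both variants. The function names `swap_tmp` and `swap_xor` are made up for illustration, and what a given compiler actually emits depends on the target and optimization level, so treat the comments as the expected shape rather than a guarantee.

```c
#include <stdint.h>

/* Plain swap through two temporaries: mirrors
   "load X in R1; load Y in R2; store R1 in Y; store R2 in X". */
static void swap_tmp(uint32_t *x, uint32_t *y) {
    uint32_t r1 = *x;  /* load X in R1 */
    uint32_t r2 = *y;  /* load Y in R2 */
    *y = r1;           /* store R1 in Y */
    *x = r2;           /* store R2 in X */
}

/* XOR swap applied directly to memory operands. Each statement is a
   read-modify-write, so it cannot need fewer memory accesses than
   swap_tmp, and it adds three XOR operations. */
static void swap_xor(uint32_t *x, uint32_t *y) {
    if (x == y) {
        return;        /* without this guard, x ^= x would zero the value */
    }
    *x ^= *y;
    *y ^= *x;
    *x ^= *y;
}
```

Calling either function as `swap_tmp(&a, &b)` exchanges `a` and `b`. Note the extra aliasing check the XOR version needs to stay correct when both pointers refer to the same location.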
For swapping data that is stored in internal CPU vector registers or matrix registers there are special shuffle instructions, which implement various kinds of transpositions.
Swapping in place was efficient mostly in the distant past, for swapping general-purpose registers on CPUs that had no dedicated exchange/swap instruction. Intel/AMD x86 CPUs have always had an exchange instruction (XCHG), so swapping in place has never been useful on them under any circumstances, ever since the launch of the IBM PC 45 years ago.
Today, the XOR trick may still be useful for swapping general-purpose registers on some microcontrollers. In the most popular ISAs, like Arm and RISC-V, however, most GPRs are interchangeable, so the need to swap them arises very rarely, only in certain kinds of loops, and even there the swap can frequently be avoided by unrolling the loop.
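As a sketch of how unrolling removes a swap, consider a loop that ping-pongs between two buffers. Everything here is hypothetical and chosen only for illustration: `step` is a stand-in update that adds 1 to each element, and `run_swapping` and `run_unrolled` are invented names.

```c
#include <stddef.h>

/* Hypothetical update step: dst[i] = src[i] + 1. */
static void step(const int *src, int *dst, size_t n) {
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i] + 1;
    }
}

/* Swaps the two buffer pointers after every iteration.
   Returns the buffer that holds the final result. */
static int *run_swapping(int *a, int *b, size_t n, unsigned iters) {
    int *src = a, *dst = b;
    for (unsigned k = 0; k < iters; k++) {
        step(src, dst, n);
        int *t = src;  /* exchange the roles of the two pointers */
        src = dst;
        dst = t;
    }
    return src;
}

/* Unrolled by two: the roles alternate in the code itself,
   so no pointer swap is needed inside the loop. */
static int *run_unrolled(int *a, int *b, size_t n, unsigned iters) {
    unsigned k = 0;
    for (; k + 1 < iters; k += 2) {
        step(a, b, n);
        step(b, a, n);
    }
    if (k < iters) {   /* odd iteration count: one leftover step */
        step(a, b, n);
        return b;
    }
    return a;
}
```

Both functions produce the same result in the same buffer for any iteration count. The unrolled version simply trades the per-iteration exchange for a slightly larger loop body and a tail case.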