Hacker News new | ask | show | jobs
by gpderetta 529 days ago
On x86 there is sometimes (depending on the specific microarchitecture) an extra cycle additional latency when using an integer operation on a xmm register last used with a float operation.

I have seen it explained as the integer and foat ALUs's being physically distant and the forwarding network needing an extra cycle to transport the operands.

1 comments

This is correct, but it’s happily pretty rare for it to matter in practice (because the domain bypass penalty is small and does not directly impact throughput, only latency).