by physguy1123 2612 days ago

I can attest that basic lock-free SPSC queues are far faster than a basic spinlocking version of the same queue, on Xeon-class x86 at least.

LOCK'ed instructions (one might use lock xchg or lock cmpxchg) act as a full memory barrier, which has quite a few performance implications if there are any outstanding loads/stores. Further, if the cache line containing the spinlock is in anything except an exclusively-held state on the writing core (it will usually be held by a different core or in a shared state), simply acquiring the lock will stall for ~25-30ns+.
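
For reference, here is a minimal sketch of the kind of spinlock being described, assuming C++11 atomics. The class name SpinLock is made up for illustration. On x86 the exchange compiles to an xchg with a memory operand, which is implicitly LOCK'ed, so every acquisition attempt pays the barrier and needs the cache line in an exclusive state.

```cpp
#include <atomic>

// Hypothetical test-and-set spinlock, for illustration only.
class SpinLock {
public:
    // Single attempt: returns true if the lock was acquired.
    bool try_lock() {
        // exchange() is an atomic read-modify-write; on x86 this is a
        // locked xchg, which needs exclusive ownership of the cache line.
        return !locked_.exchange(true, std::memory_order_acquire);
    }

    void lock() {
        while (locked_.exchange(true, std::memory_order_acquire)) {
            // Spin on a plain load while the lock is held, so waiting
            // cores read a shared copy instead of repeatedly issuing
            // locked instructions.
            while (locked_.load(std::memory_order_relaxed)) {
            }
        }
    }

    void unlock() {
        // A release store is a plain mov on x86; no LOCK prefix needed.
        locked_.store(false, std::memory_order_release);
    }

private:
    std::atomic<bool> locked_{false};
};
```

Even uncontended, lock() here always executes one locked instruction, and the line holding locked_ bounces between cores whenever producer and consumer alternate.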

On the other hand, a simple SPSC lock-free queue can hit <10ns per message with a latency about equal to the core-to-core communication latency.
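
A rough sketch of what such a queue can look like, assuming a bounded ring buffer with a power-of-two capacity and C++11 atomics. SpscQueue, try_push, and try_pop are names I made up, and the 64-byte alignment assumes a typical x86 cache line. The point is that the hot path uses only acquire loads and release stores, which on x86 are plain movs with no LOCK prefix.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical bounded single-producer/single-consumer ring buffer.
// Exactly one thread may call try_push and exactly one may call try_pop.
template <typename T, std::size_t Capacity>
class SpscQueue {
    static_assert(Capacity > 0 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");

public:
    // Producer side. Returns false if the queue is full.
    bool try_push(const T& value) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        // Acquire pairs with the consumer's release store to head_, so the
        // slot we are about to overwrite has already been read.
        const std::size_t head = head_.load(std::memory_order_acquire);
        if (tail - head == Capacity) {
            return false;
        }
        slots_[tail & (Capacity - 1)] = value;
        // Release publishes the slot write before the new tail is visible.
        tail_.store(tail + 1, std::memory_order_release);
        return true;
    }

    // Consumer side. Returns false if the queue is empty.
    bool try_pop(T& out) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        // Acquire pairs with the producer's release store to tail_.
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head == tail) {
            return false;
        }
        out = slots_[head & (Capacity - 1)];
        // Release hands the slot back to the producer.
        head_.store(head + 1, std::memory_order_release);
        return true;
    }

private:
    // Indices grow monotonically and are masked on access. Each lives on
    // its own cache line (64 bytes assumed) to avoid false sharing.
    alignas(64) std::atomic<std::size_t> head_{0};
    alignas(64) std::atomic<std::size_t> tail_{0};
    alignas(64) T slots_[Capacity]{};
};
```

Usage is one thread looping on try_push and another looping on try_pop. Each side only writes its own index, so the only cross-core traffic is the cache line transfer for the other side's index and the slot itself, which is where the "about equal to core-to-core latency" figure comes from.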