| HN Mirror

The OS IPC mechanism quite likely does not use a lock free queue -- or at least, not in quite the same way as I think the grandparent post refers to.

Using a well implemented ring buffer [+] can get enqueue operations down to a few instructions and something like two memory fences.

The overhead of IPC, which wakes up the kernel scheduler, switches the processor back and forth between privilege modes a few times on the way, knocks apart all the CPU cache and register state to swap in another process, while the MMU is flipping all your pages around because these two processes don't trust each other to write directly into their respective memory... is not going to have quite the same performance characteristics.

An moment in the history of logging is java's log4j framework, which, within a single process, used exclusive synchronization. When this was replaced by a (relatively) lock-free queue implementation, throughput increased by orders of magnitude. (Their notes and graphs on this can be found at https://logging.apache.org/log4j/2.x/manual/async.html .) This isn't an exact metaphor for the difference between a good lockfree ringbuffer and IPC either, but it certainly has some similarities, and indeed ends with a specific shout-out to the power of avoiding "locks requiring kernel arbitration".

[+] The "mechanical sympathy" / Disruptor folks have some great and accessible writeups on how they addressed the finer points of high performance shared memory message passing. http://mechanitis.blogspot.com/2011/06/dissecting-disruptor-... is one of my favorite reads.