by derefr
4281 days ago
> You can also offload your logging to a dedicated thread and then use a lock free queue to increase your performance even more. Or use syslog() like a sane person, and let that "extra thread" live inside the OS IPC mechanism. (Or stdout/stderr like a modern sane person, and let upstart/systemd/docker/etc. push your logs to syslog if that's where it feels like pushing them.)
Using a well-implemented ring buffer [+] can get enqueue operations down to a few instructions and something like two memory fences.
The overhead of IPC, which wakes up the kernel scheduler, switches the processor back and forth between privilege modes a few times on the way, knocks apart all the CPU cache and register state to swap in another process, while the MMU is flipping all your pages around because these two processes don't trust each other to write directly into their respective memory... is not going to have quite the same performance characteristics.
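To make the "few instructions and two memory fences" claim concrete, here is a minimal sketch of a single-producer / single-consumer ring buffer in Java. It is not the Disruptor's or log4j's actual code; the class name `SpscRingBuffer` and its methods are hypothetical, and it assumes exactly one producer thread and one consumer thread. The enqueue path is one read of the consumer's counter, one array store, and one ordered store to publish, with no locks and no system calls.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical SPSC ring buffer, for illustration only (one producer, one consumer). */
final class SpscRingBuffer<T> {
    private final Object[] slots;
    private final int mask;
    // head: next slot to read, advanced only by the consumer.
    private final AtomicLong head = new AtomicLong();
    // tail: next slot to write, advanced only by the producer.
    private final AtomicLong tail = new AtomicLong();

    SpscRingBuffer(int capacity) {
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        slots = new Object[capacity];
        mask = capacity - 1; // lets us wrap with a bitwise AND instead of a modulo
    }

    /** Producer thread only. Returns false instead of blocking when the buffer is full. */
    boolean offer(T item) {
        long t = tail.get();
        if (t - head.get() == slots.length) {
            return false; // consumer has not freed a slot yet
        }
        slots[(int) (t & mask)] = item;
        tail.lazySet(t + 1); // ordered store: the slot write becomes visible before the new tail
        return true;
    }

    /** Consumer thread only. Returns null when the buffer is empty. */
    @SuppressWarnings("unchecked")
    T poll() {
        long h = head.get();
        if (h == tail.get()) {
            return null; // nothing published yet
        }
        int i = (int) (h & mask);
        T item = (T) slots[i];
        slots[i] = null;     // drop the reference so it can be garbage collected
        head.lazySet(h + 1); // ordered store: frees the slot for the producer
        return item;
    }
}
```

A logging front end would call `offer(message)` on the hot path and have the dedicated logging thread loop on `poll()`. What to do when `offer` returns false (drop, spin, or block) is a policy choice this sketch leaves open.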
An instructive moment in the history of logging is Java's log4j framework, which, within a single process, used exclusive synchronization. When this was replaced by a (relatively) lock-free queue implementation, throughput increased by orders of magnitude. (Their notes and graphs on this can be found at https://logging.apache.org/log4j/2.x/manual/async.html .) This isn't an exact analogy for the difference between a good lock-free ring buffer and IPC either, but it certainly has some similarities, and indeed their writeup ends with a specific shout-out to the power of avoiding "locks requiring kernel arbitration".
--
[+] The "mechanical sympathy" / Disruptor folks have some great and accessible writeups on how they addressed the finer points of high-performance shared-memory message passing. http://mechanitis.blogspot.com/2011/06/dissecting-disruptor-... is one of my favorite reads.