
In theory, a GC should never be faster than manual memory management. Anything a GC can do can be done manually, but manual management has much more context about appropriate timing, locality, and resource utilization that a GC can never have. A large aspect of performance in modern systems is how effectively you can pipeline and schedule events through the CPU cache hierarchy.

There are a few different ways a GC impacts code performance. First, even low-latency GCs have a latency similar to a blocking disk op or worse on modern hardware. In high-performance systems we avoid blocking disk ops entirely, specifically because they cause a significant loss in throughput, and use io_submit/io_uring instead. Worse, we have limited control over when a GC occurs; at least with blocking disk ops we can often defer them until a convenient time. To fit within these processing models, worst-case GC latency would need to be much closer to microseconds.

Second, a GC operation tends to thrash the CPU cache, the contents of which were carefully orchestrated by the process to maximize throughput before being interrupted. This is part of the reason high-performance software avoids context-switching at all costs (see also: thread-per-core software architecture). It is also an important and under-appreciated aspect of disk cache replacement algorithms: an algorithm that avoids thrashing the CPU cache can have higher overall performance than one with a higher cache hit rate.

Lastly, when there is a large stall (e.g. a millisecond) in the processing pipeline outside the control of the process, the effects of that propagate through the rest of the system. It becomes very difficult to guarantee robust behavior, safety, or resource bounds when code can stop running at arbitrary points in time. While the GC is happening, finite queues are filling up. Protecting against this requires conservative architectures that leave a lot of performance on the table. If all non-deterministic behavior is asynchronous, we can optimize away many things that can never happen.
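
To put rough numbers on the queue point, here is a back-of-the-envelope sketch. The function name, the 2M messages/sec arrival rate, and the 1024-slot queue are all made up for illustration; the only assumption is that nothing is consumed while the process is stalled.

```python
def backlog_after_stall(arrival_rate_per_s: int, stall_us: int,
                        queue_capacity: int, queued_before: int = 0):
    """Return (queued, dropped) for a bounded queue after a stall.

    Hypothetical model: producers keep arriving at a constant rate,
    the consumer does no work for `stall_us` microseconds, and anything
    that does not fit in the queue is dropped.
    """
    # Integer math in microseconds avoids floating-point rounding.
    arrived = arrival_rate_per_s * stall_us // 1_000_000
    free_slots = max(queue_capacity - queued_before, 0)
    accepted = min(arrived, free_slots)
    return queued_before + accepted, arrived - accepted


# Illustrative numbers only: 2M msgs/sec into a 1024-slot queue.
print(backlog_after_stall(2_000_000, 1_000, 1024))  # 1 ms stall
print(backlog_after_stall(2_000_000, 10, 1024))     # 10 us stall
```

Under those assumed numbers a 1 ms stall overflows the queue, while a 10 microsecond stall barely registers. Sizing every queue for the millisecond case is the kind of conservative architecture I mean.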

A lot of modern performance comes down to exquisite orchestration, scheduling, and timing in complex processes. A GC is like a giant, slow chaos monkey that randomly destroys the choreography that was so carefully created to produce that high performance.