| This is more specifically about any reference counted system, though it also includes unique owner approaches and others. Namely, when you free an object, whether it involves manually written code to release anything it holds recursively, or through some form of reference counter, you do not know the size of object graph you're freeing. If you add any custom release logic to the mix (destructors, any kind of shutdown logic, etc), you also don't know how much extra work unrelated to simple memory management will be also involved. This means that naive manual memory management, or reference counting systems, have usually unbounded and unpredictable pause times (in addition to various costs involved in implementation details, whether it's malloc()/free() or refcounting). You do not know if object X going out of scope isn't the sole owner of a huge object graph. This is one of (though not only) reason manual memory management and refcounting aren't considered good options in real-time compared to static allocation of everything (object pools, static stack analysis & limits) or RT Garbage Collector algorithms, though of course it depends on how deep into the rabbit hole you go. Garbage Collector systems tend to have more control over pauses, but majority of algorithms used optimise for throughput, not latency or predictability. Thus you have "huge GC pause" as GC uses fastest space/time algorithm and "pays down the debt" for making allocation stupidly fast (often single Compare-And-Swap operation, sometimes not even that if each thread uses separate nursery and can keep local heap-end pointer). However you can also setup environment towards low latency and predictability, probably the most famous example being IBM Metronome (available in J9 and recently open sourced), which uses a static quanta of time for GC and minor variability is the noise of multiple threads of execution synchronizing. This works by dividing available time between mutator (your "work" code) and collector (GC) statically, so that GC will for example always run for 30ms then mutator for 70ms then GC for 30ms and so on. Similar techniques exist for amortizing pathological performance of reference counting by pushing actual deallocation into separate collector thread or similarly scheduled slice (depends on whether you're fine with possibly unpredictable contention between two threads or want more strict timing). A lot of very hard real time is more about predictability in the end. |
Deferred decrement enqueues objects whose reference counts drop to zero onto a "destruction queue" instead of deallocating them immediately. Every time you allocate an object, you work through some fixed amount of the destruction queue, such as two objects, unless it's empty; this amounts to destroying each of the references in the destroyed objects, decrementing whatever objects they refer to. This may drop those objects' reference counts to zero, in which case you add them to the destruction queue. This is very simple, it strictly limits the amount of work needed per allocation or deallocation, thus completely avoiding any large pauses, and it's guaranteed to empty out the queue eventually.
However, it's still reference counting, so it's still inefficient, and it still uses an unpredictable amount of memory, which means that any allocation can fail. Many real-time systems are not amenable to recovering from failures.