Hacker News new | ask | show | jobs
by kragen 1710 days ago
It's worth mentioning that (1) typically even ordinary reference counting avoids large pauses, because typically objects that go out of scope aren't the sole owners of a huge object graph, and that this is often but not always a significant improvement over (other simple kinds of) garbage collection; (2) there's a readily available tradeoff with reference counting known as "deferred decrement" which can meet hard real-time deadlines --- but at the cost of sacrificing the relatively predictable and compact memory usage you usually get from reference counting.

Deferred decrement enqueues objects whose reference counts drop to zero onto a "destruction queue" instead of deallocating them immediately. Every time you allocate an object, you work through some fixed amount of the destruction queue, such as two objects, unless it's empty; this amounts to destroying each of the references in the destroyed objects, decrementing whatever objects they refer to. This may drop those objects' reference counts to zero, in which case you add them to the destruction queue. This is very simple, it strictly limits the amount of work needed per allocation or deallocation, thus completely avoiding any large pauses, and it's guaranteed to empty out the queue eventually.

However, it's still reference counting, so it's still inefficient, and it still uses an unpredictable amount of memory, which means that any allocation can fail. Many real-time systems are not amenable to recovering from failures.

1 comments

The main difference in favour of GC in the first case is that GC gets you centralised place to control it.

Generally GC offers more explicit, centralised control over allocation/deallocation, and I've seen it favoured over other schemes if dynamic memory management is necessary in critical code.

That's interesting! Is the idea that, for example, your interrupt handler can freely allocate memory and update pointers and stuff, and you make sure to run the GC often enough that there's always enough memory available for the interrupt handler when it fires? Has this been written up anywhere?

I usually think that the main advantage of GC is not a performance thing; it's that your program can pass around arbitrarily large and complex data structures as easily as it can pass around integers, without you having to think about how much memory they use or when it gets freed. Of course, not thinking about that can easily result in programs that use too much memory and die, or run your machine out of RAM; but for very many programs, either "compute a correct answer" or "die" are acceptable outcomes. You wouldn't want to program your antilock braking system that way, but lots of times people are willing to put up with programs crashing occasionally.

For very tight code, like interrupt handler context, you usually avoid allocation at all (with some object caches to provide storage possibly), but yes.

Metronome is all about partitioning run-time into static slice of GC/mutator ensuring that mutator code has predictable performance - it is slower than if mutator was running full-tilt, but it means that unless you manage to horribly exceed GC performance in collection, you're not going to see a pause.

That said, GCs are faster than manual management usually thanks to pauses - if you look at total runtime, wall-clock time, and not latency. The most common algorithms trade latency for throughput and total wall-clock time.

Yeah, that's what I thought about interrupt handlers. (Also if you're allocating in your interrupt handler, your background thread can't allocate without either disabling interrupts or using some kind of lock-free synchronization.)

I still haven't read the Metronome paper.

I don't know if I'd go so far as to say GC is usually faster than manual memory management, even if we measure by throughput, but at least "often faster" is justifiable. I think usually you can find a way for manual memory management to be faster.