Hacker News new | ask | show | jobs
by kragen 3458 days ago
Thank you for the link to Gabriel's book! I paged through it just now and summarized what I found in https://news.ycombinator.com/edit?id=13307012.

The claims being debated here are, as I understand them, my half-remembered claim that it was common to spend ⅓ to ½ of your execution time in the garbage collector until generational GC, and Simon Brooke's claim that actually 5% was normal and 10% was in pathologically bad cases. I think Gabriel's book shows conclusively that allocation-heavy programs could easily spend ⅓ to ½ of their time in GC, or 80% to 90% in pathological cases. But Brooke could still be right about the typical case, since when allocation is expensive, people avoid it in performance-critical code when possible.

I agree that the problem mostly went away once generational (aka ephemeral) GC was shipped. In the essay, I dated that to Lieberman and Hewitt in 1980, but in a more recent discussion with Allen Wirfs-Brock, I learned that actual generational GC didn't ship until 1986. (And Lieberman and Hewitt's paper, though submitted in 1980, wasn't published until 1983.)

Given the question of whether non-generational GC would cost more like 5% or more like 25% of your performance, the performance of generational GCs is somewhat of a false trail. Hopefully nobody was claiming generational GCs would eat double-digit fractions of your CPU time. I wasn't, at least.

1 comments

See also ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1417.pdf. for an early perspective on generational GC.

Ephemeral GC btw. means something slightly different. An ephemeral GC is mostly concerned with the detection and collection of short lifetime objects in main memory.

The main problem is that architectures, implementations, actual computers, programs were so different, that you could find all kinds of numbers. Even a generational GC can get extremely busy and may not prevent from having full GCs over large virtual memory from time to time. It's also a huge difference if the GC was incremental (the work was spread over the whole compuation) or if the system stopped for much of the GC work.