Hacker News new | ask | show | jobs
by simonbrooke 3460 days ago
This is a beautiful essay, and I've really enjoyed it. However, while interesting and inspirational it doesn't all ring true. Lisps from the years before generational garbage collection might spend five percent of their time in GC; in particularly awkward cases it might rise to ten percent. I'm not saying this was desirable: it was not, particularly in interactive processes. But it was nothing like 'usually spent between a third and half of their time running the garbage collector'.

I've seen some very long GCs indeed. I'm sure I've seen fifteen minute GCs. But I've never seen GC eat a third of the processor cycles. I don't believe it.

2 comments

Outside of the case where you run out of memory and start invoking emergency GC sweeps every handful of allocations, of course.
Exactly, which is about as common as nonstop paging on a Unix system, and as sensible as a criticism.
When I was in college in the early 90s, unix systems were nonstop-paging quite often, and lisp systems would frequently pause while you were interacting with them, for nontrivial amounts of time, like multiple seconds.

And that was the 90s, after Lisp had been around for decades...

Commercial Common Lisps, or the text based ones developed by students writing master thesis?
That's why my university had a site license of Allegro CL for its SUN SPARC cluster. Thus every student had access to a useful Allegro CL installation.
Please note that SPARCs didn't ship until after generational GC had already been invented and shipped in commercial products.
This feels to me like moving the goalposts.

But come on, I have used emacs for 25 years, and on a daily basis it stalls for annoying amounts of time while I am just doing something simple like editing a .cpp file. Today. In the year 2017.

I'm flattered; thank you.

The performance claim is from my memory of reading LISPCraft, the 1984 edition written before generational GC, where Wilensky claims that something like a third to half of your program runtime is typically spent in GC, so consing is much more expensive than it appears. But I don't have a copy of LISPCraft here, and I read it in probably 1990 or so, so I might be misremembering.

How can we verify the truth about these conflicting claims about LISP systems from the early 1980s? Maybe Gabriel's book has a more definitive answer?

(A possible confounding factor is that, at the time, the frequently-executed parts of Lisp programs were written in a more imperative style to reduce the load on the GC.)

Looking at Gabriel's 1985 book on the performance of Lisp systems (lispm posted a link in another thread: http://rpgpoet.com/Files/Timrep.pdf) we see many benchmarks using 0 GC time. His "Boyer" theorem-proving benchmark (p.133, 142/294) very commonly uses as much time in the "GC" column as in the "CPU" column (½ of the total), twice as much on SAIL (⅔ of the total), and almost never less than about half as much (⅓ of the total).

I'd like to claim that one of the four configurations of VAX Franz Lisp shone here, since it spent only about ⅐ of its total execution time in the GC. However, it spent roughly as much absolute time in the GC as the other three Franz Lisp configurations; it's just that it took 6× as long as normal to do the non-GC part of the benchmark; "normal", for context, was several times slower than SAIL. It's not clear that this counts as a performance improvement in any sense!

But, since many of his benchmarks use 0 GC time, I don't think I can justify the statement that GC just before the introduction of generational GC normally used 50% of a program's total run time; most programs did not consist primarily of evaluating tree-rewriting rules, which obviously are allocation-intensive. (Boyer isn't the worst, though. On Vaughan Pratt's Deriv symbolic differentiation benchmark, SAIL spends almost 90% of its total CPU time in the GC, though some other collectors handle it better.)

Reading through some of the other benchmarks, Franz Lisp commonly comes out rather GC-heavy, though less so than SAIL; on div2, for example, Franz spends 80% of its time in the GC. It may be significant that Wilensky was using Franz to write the book I got my original impression from.

I'm somewhat surprised that the FFT benchmark is written to use GC time, since normally I think of FFTs as not requiring any allocation at all, much less significant numbers of allocations of separate objects...

So, in conclusion, I do think it's justifiable to say that there were significant classes of programs for which all Lisp systems continued to spend ⅓ to ½ of their time in the GC, or 80% to 90% in bad cases, right up until the introduction of generational GC shortly after Gabriel's book. But it's probably not justifiable to apply that statement generally to all Lisp programs. Your 5%–10% number sounds quite reasonable as an overall average.

> I'm somewhat surprised that the FFT benchmark is written to use GC time, since normally I think of FFTs as not requiring any allocation at all, much less significant numbers of allocations of separate objects...

Keep in mind:

a) not every machine had FP hardware, so FP instructions might be emulated in software.

b) dealing with floating point numbers may mean allocate float objects. For example not every Lisp had a direct array of floats. The array then would be a general array pointing to float objects on the heap.

c) not every compiler/interpreter was clever enough to keep floats in registers (or similar) while doing float computations.

Even today, allocating floats can be a problem in Lisp applications.