Hacker News new | ask | show | jobs
by bjourne 3037 days ago
It never gets as fast as stack allocating variables. Consider a loop:

while(1) { Integer n = new Integer(....) vs int n = ... }

A C compiler would allocate the n variable on the stack, making the "allocation" completely free. But in a GC:ed language, the n variable would be bump allocated once every loop. That wouldn't in itself be so costly, but every so often, a GC cycle would be needed as the garbage accumulates. Furthermore, in C the address of the n variable stays in place while in a GC:ed language it moves a bit for each loop.

That is why escape analysis is a fruitful optimization. It takes heap allocated objects and attempts to stack allocate them, similar to how a C compiler would do it.

2 comments

To be fair, I said "approximately" The cost of a nursery GC in SBCL is O(n+m) where n is the number of live objects in the nursery and m is the number of GC roots. If this code is all you are running then n will be 1, so it's just scanning the roots, which only happens when you have allocated enough integers to fill the nursery, so gets amortized across thousands of loop iterations.
> But in a GC:ed language, the n variable would be bump allocated once every loop.

Citation needed. I don't know of any GCed language that heap-allocates local variables in this way (well, maybe SML/NJ does, but I doubt it). Certainly not Java or any Common Lisp I've ever used.

But I agree with your larger point: heap allocation is never quite as fast as stack allocation, once you factor in the additional GC load. I don't actually know how close it gets with modern collectors; would love to see some numbers.

C# does and Java did before the escape analysis optimization became default. You can find numbers in this old article from 1999: https://www.cc.gatech.edu/~harrold/6340/cs6340_fall2009/Read...

"and the overall execution time reduction ranges from 2% to 23% (with a median of 7%) on a 333 MHz PowerPC workstation with 128 MB memory."

The benchmark is 20 years old so it is kind of out of date. I don't know of any modern benchmarks. I suspect that the difference would be much bigger nowadays because programmers don't avoid allocating small local objects as much.

Oh, now I understand what you're saying. I got thrown off when you said "the address of the variable n [...] moves a bit for each loop". This isn't quite right; it's not the address of the variable n itself that changes, it's the address of the allocated object that it points to.