| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by bjourne 3037 days ago

It never gets as fast as stack allocating variables. Consider a loop:

while(1) { Integer n = new Integer(....) vs int n = ... }

A C compiler would allocate the n variable on the stack, making the "allocation" completely free. But in a GC:ed language, the n variable would be bump allocated once every loop. That wouldn't in itself be so costly, but every so often, a GC cycle would be needed as the garbage accumulates. Furthermore, in C the address of the n variable stays in place while in a GC:ed language it moves a bit for each loop.

That is why escape analysis is a fruitful optimization. It takes heap allocated objects and attempts to stack allocate them, similar to how a C compiler would do it.

2 comments

aidenn0 3037 days ago

To be fair, I said "approximately" The cost of a nursery GC in SBCL is O(n+m) where n is the number of live objects in the nursery and m is the number of GC roots. If this code is all you are running then n will be 1, so it's just scanning the roots, which only happens when you have allocated enough integers to fill the nursery, so gets amortized across thousands of loop iterations.

link

ScottBurson 3037 days ago

> But in a GC:ed language, the n variable would be bump allocated once every loop.

Citation needed. I don't know of any GCed language that heap-allocates local variables in this way (well, maybe SML/NJ does, but I doubt it). Certainly not Java or any Common Lisp I've ever used.

But I agree with your larger point: heap allocation is never quite as fast as stack allocation, once you factor in the additional GC load. I don't actually know how close it gets with modern collectors; would love to see some numbers.

link

bjourne 3037 days ago

C# does and Java did before the escape analysis optimization became default. You can find numbers in this old article from 1999: https://www.cc.gatech.edu/~harrold/6340/cs6340_fall2009/Read...

"and the overall execution time reduction ranges from 2% to 23% (with a median of 7%) on a 333 MHz PowerPC workstation with 128 MB memory."

The benchmark is 20 years old so it is kind of out of date. I don't know of any modern benchmarks. I suspect that the difference would be much bigger nowadays because programmers don't avoid allocating small local objects as much.

link

ScottBurson 3036 days ago

Oh, now I understand what you're saying. I got thrown off when you said "the address of the variable n [...] moves a bit for each loop". This isn't quite right; it's not the address of the variable n itself that changes, it's the address of the allocated object that it points to.

link