| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by tsimionescu 688 days ago

No, the more memory you give them, the more efficient they are, and generally they are much more efficient than an equivalent program using malloc() + free() for every piece of memory that gets allocated.

Take the extreme case of a program where every allocation is permanent. The GC allocation will be much much faster than malloc() (GC allocation is normally just bumping a pointer to the end of the heap, while malloc() typically does a lot more work to segregate objects by size etc), and collection will never run. So, overall, the program will be significantly faster.

Edit: More realistically, a program that typically produces 1:1 live mmeory:junk, but occasionally spikes, will consume 0 GC time in normal operation (the GC will just let the junk sit there), while free() needs to run on all that junk as soon as it was created, otherwise it leaks forever.

Also, the fact that GCs can defer collection to later than the object going out of scope can often make them more efficient than a traditional free(). For example, if you have a large graph data structure that is now junk, free() needs to do work that's O(size of graph) to free all the nodes. A copying collector, the most popular design, never even looks at the graph, so it frees it in O(1). Edit: of course, the converse is also true: if the graph survives N collections, the copying GC has to do O(N * size) work, where malloc()/free() do nothing at all.

2 comments

p_l 688 days ago

This is in fact heavily related to "ancient" Unix approach, and partially why it optimized so hard for many small processes.

PDP-11 unix (without split instruction/data space) had 32kB of memory for userland process, total, plus another 24kB for kernel (8kB are reserved for I/O devices).

The memory management interface that gave us malloc() and free() ultimately boiled down to brk() and sbrk() which simply set a pointer describing where stack and heap met in memory. Using many small programs, you'd have as little as possible of that 32kB used by code, and prefer to simply allocate using bump pointer and never free (better reuse object pools, which is also a technique used with GCs). A program in a pipeline would mostly allocate some stuff at the start plus buffers, and while some of the early bits might become garbage it doesn't really matter because you can't share parts of your 32kB block (or at best you can use 8kB segments). So you do "missile-style GC" and just use a bump-pointer allocator (brk() + sbrk() + sizeof combo), then exit to OS when your work is done and the entire memory space gets cleaned.

EDIT: Some GCs (like in later versions of Symbolics Lisp Machines, or presently in SBCL behind some experimental options) provided multiple arena support explicitly to be able to force-declare everything allocated in an area as garbage.

link

andrepd 688 days ago

GC does force you into a memory layout for your objects that negates any advantage in the allocation routines proper.

link

tsimionescu 687 days ago

I'm guessing you're referring to the fact that the most popular GC languages allocate almost every object on the heap, even when used as a field of another object. This is by no means a constraint of the GC, it is a separate design choice.

And in fact C# has always supported "value objects", which do get allocated in-place, Java is adding the same support probably in the next release, and Go actually defaults to it.

link

quotemstr 688 days ago

No it doesn't. GCs need to be able to traverse objects. They don't impose object layout constraints.

link

pjmlp 687 days ago

Only if speaking about languages without system programing capabilities alongside their GC features.

link