Hacker News new | ask | show | jobs
by Doradus 4097 days ago
"I wish that when GC encountered a cycle it would just tell the programmer that they now have a leak..."

Wow. Overreact much? :-)

Garbage collection works great for one use case: managing heap memory. Please, let's not return to a universe where you need to free each chunk of memory piecewise. That makes freeing memory O(n) in the number of chunks freed, where modern collectors can free unlimited numbers of objects in O(1) time. Compared to this, malloc and free are a little like keeping score in soccer by tracking every action that does not result in a ball entering a net.

The trouble is finalizers. Finalizers suck. They tie the management of one resource (say native memory or file handles) to an unrelated resource (heap memory). This has the terrible consequences highlighted by this story, plus others. It keeps the native resource tied up until the collector decides to run, for one thing. It also harms performance in secondary ways even when the finalizer isn't running. For example, finalizers typically defeat escape analysis, meaning objects with finalizers can't be stack-allocated.

It's a good idea to use finalizers mostly for detecting invalid final states of objects (like leaking resources), as long as you stay aware that finalizers can slow down your program in unexpected ways.

[These are my personal opinions, not those of my employer.]

3 comments

>That makes freeing memory O(n) in the number of chunks freed

A GC can't outdo that. For everything the GC frees, it has to do some work to answer the question, "Should I free this or not?" Even if that question can magically be answered with a single CPU instruction, that's still O(n) answers in the number of chunks freed.

I think you're referring to a mark and sweep collector. Java's garbage collector uses better algorithms. The collector never even looks at objects that are not referenced from the root set, and an allocation operation is nothing but a pointer bump. Even Cheney's algorithm from the early 1970s works this way, and collectors have only improved since then.
Except that the OS zeros memory when it's released...
The JVM doesn't release the memory to the OS when garbage is collected; only when the heap shrinks. Any zeroing the OS might do is proportional to the size change in the heap, not to the number of objects freed.
Good point.

However, that brings up a related issue - namely that the JVM requires all (or close enough to all) object memory to be initialized. And given that the amount of memory required to be initialized is at minimum linear w.r.t. the number of objects required to be initialized...

Work done is still O(n) minimum w.r.t. the number of objects allocated regardless.

Well, it's hard to deny that the work done by the program is O(n) (or even Ω(n)) in the number of objects created by the program. That's almost tautological.

The thing that is interesting about the asymptotic complexity of garbage collection, though, is that it gives you a peek into the future of GC overhead as heaps get larger.

Consider a program with some bounded steady-state memory consumption, and a GC that takes O(1) time to free unlimited amounts of memory. In such a system, you can reduce your GC overhead arbitrarily simply by enlarging the heap. A larger heap gives a larger time interval between collections, yet doesn't make the collections any slower. Don't like 6% overhead? Fine. Make the heap 10x larger and you have 0.6% overhead.

This time-space trade-off is always available to the end user of such a system. It's certainly not available with more primitive systems like malloc+free.

[These are my personal opinions, not those of my employer.]

Yeah, managing memory with malloc and free directly was a nightmare. Back when I worked with Java, however, GC still suffered from unpredictable performance as well as a requirement to have much larger heaps. Are these now mostly solved problems? If not, what do you think of something like this (syntax aside):

http://en.wikipedia.org/wiki/Smart_pointer#C.2B.2B_smart_poi...

My personal experience with similar such systems was that even when performance was not a concern the cost of having to untangle and document my reference structure was well worth the time.

Has any language ever gotten the balance of tradeoffs for finalizers right for strong GC?

> finalizers typically defeat escape analysis

Why is that? I would have thought they would work really well together - if an object such as a file stream doesn't escape you could inline and run the finaliser after the last use of the object goes out of scope.

Or do you mean they defeat current implementations of escape analysis?

It's hard to say. The finalization spec is subtle.

To make this work, the JIT would need to perform escape analysis on the finalizer too (since the finalizer can make an object escape). If that analysis succeeds, then I suppose the object could be stack-allocated and the finalizer called directly. Even then, though, you would have to think carefully about all the subtle ramifications, like the consequences running a finalizer on an application thread instead of the finalizer thread. (What if the finalizer grabs a lock in an attempt to protect itself from the application code? Now this becomes a recursive lock by one thread, and so both the application code and the finalizer can enter it at the same time!)

If this ever became important enough, it's possible the JIT developers could get this right in every case with enough effort invested. I wouldn't count on that happening any time soon.

[These are my personal opinions, not those of my employer.]

If this stuff isn't escaping then the JIT can prove that it isn't using a lock.
Actually, synchronization is not an escape. Objects can be stack-allocated in methods that do synchronization.
Yes I know - so why did you bring it up as an example of something that can prevent EA?
(Whoops, I've misunderstood you twice. I think my other reply answered the wrong question. Let me try again.)

I didn't say synchronization can prevent EA. I said that calling a finalizer on an application thread can defeat the mutual exclusion that would usually be guaranteed by Java's locking. If the app thread holds a monitor when the finalizer is called, that would usually make the finalizer block until the app exits the monitor. If they're on the same thread, that becomes a recursive monitor enter on the same thread, which doesn't block.

Even if you and I solve this particular problem, these are only the problems that occurred to me off the top of my head. With these kinds of subtle, thorny issues lurking around every corner, I wouldn't expect JIT development teams to give this any real attention unless there's some workload where it matters.

[These are my personal opinions, not those of my employer.]

I must have misunderstood you. You made a statement that I parsed as follows:

"If (this stuff isn't escaping) then (the JIT can prove that it isn't using a lock)."

That is logically equivalent to:

"If not (the JIT can prove that it isn't using a lock) then not (this stuff isn't escaping)."

Hence, I was explaining that using a lock doesn't count as an escape. But apparently I just misunderstood your meaning?