Hacker News new | ask | show | jobs
by jfr 5534 days ago
> A GC system with explicitly visible reference counts (and immediate freeing) with language support to make it easier to get the refcounts right [...]

To be a little pedantic on the subject, such a system (reference counting and immediate freeing) is a form of automatic memory management, but it is not GC in any way. Garbage collection implies that the system leaves garbage around, which needs to be collected in some way or another. The usual approach to refcounting releases resources as soon as they are no longer required (either by free()ing immediately or by sending it to a pool of unused resources), thus doesn't leave garbage around, and doesn't need a collector thread or mechanism to.

There are partial-GC implementations of refcounting, either because items are not free()d when they reach zero references, or to automatically detect reference loops which are not handled directly.

I agree with Torvalds on this matter. GC as it is promoted today is a giant step that gives programmers one benefit, solving one problem, while introducing a immeasurable pile of complexity to the system creating another pile of problems that are still not fixed today. And to fix some of these problems (like speed) you have to introduce more complexity.

This is my problem with GC. I like simplicity. Simplicity tends to perform well, and being simple also means it has little space for problems. Refcounting is simple and elegant, you just have to take care of reference loops, which also has another simple solution, that is weak references. I can teach a class of CS students everything they need to know to design a refcounting resource management system in one lesson.

GC is the opposite: it is big, complex, and a problem that the more you try to fix it, the more complex it becomes. The original idea is simple, but nobody uses the original idea because it performs so badly. To teach the same class how to design a GC system that performs as well as we expect today, an entire semester may not be enough.

7 comments

I agree with Torvalds on this matter.

In a way, I do as well.

GC as it is promoted today is a giant step that gives programmers one benefit, solving one problem, while introducing a immeasurable pile of complexity to the system creating another pile of problems that are still not fixed today. And to fix some of these problems (like speed) you have to introduce more complexity.

There are plenty of contexts where speed is a non-issue. In those cases, GC has been a huge win. The conceptual simplicity is the important part. The cost of the resources that would be saved with explicit and optimized memory management would be far outweighed by the resources required to implement such things.

The original idea is simple, but nobody uses the original idea because it performs so badly.

This is simply not true.

In the context of IO-bound enterprise systems, I've seen generational GC perform admirably, almost magically. As a lark, I've put infinite loops into such apps that do nothing but allocate new objects, and unless you are doing an exceptionally intense operation, you couldn't tell the difference. Properly tuned generational GC can be a truly fantastic seeming thing!

However, I will agree that the concerns Linus highlights are real, and that refcounting systems, like the one in iOS are by far better choices in many contexts.

EDIT: The above system I victimized, I only victimized in the TEST environment, but it was populated with something like 2-week old production data. The application in question is a traditional client/server desktop app used by a major energy company and had 800 active users at the time, handling millions in transactions every minute.

IDEA: If someone had an augmented ref-counting system with a runtime containing an optional cycle-detector and something like LINT but for the runtime reference graph, one would get most of the benefits of GC with the efficiency of the ref-counting system. I half expect someone to tell me that this already exists for Python.

In 2004 Bacon et al. showed that tracing GC and ref-counting GC are special cases of a more general framework, and that it is possible to combine them to get the benefits of both. See "A Unified Theory of Garbage Collection" here:

http://www.research.ibm.com/people/d/dfb/publications.html

Scholar cluster:

http://scholar.google.com/scholar?q=bacon+unified+theory+gar...

Incidentally, it is not even remotely true that reference counting is more efficient than tracing GC in all cases.

> I've seen generational GC perform admirably, almost magically. As a lark, I've put infinite loops into such apps that do nothing but allocate new objects

While I agree that generational GC can perform spectacularly well, what you're describing is close to the case it's optimized for, not close to its worst case. The worst case is that you allocate lots and lots of small objects and then write a pointer to all of them into a tenured garbage object.

> I half expect someone to tell me that this already exists for Python.

Yes, that's how Python works, except that I don't know what you mean by "something like LINT but for the runtime reference graph."

While I agree that generational GC can perform spectacularly well, what you're describing is close to the case it's optimized for, not close to its worst case

Here's the thing: Most of the rest of the app was rather close to the case it's optimized for.

I don't know what you mean by "something like LINT but for the runtime reference graph."

Something that tells you that you've created a reference graph with a cycle, you have a memory leak, or that you're using references in some other stupid or suboptimal way. I'm not even sure if there's a way to automatically detect anything like the last category, though the first two are certainly detectable. Basically, you take most of the infrastructure of GC, and you just turn it into a runtime advisor to warn devs and testers of mistakes.

Great Circle commercialized the Boehm collector back in the 1990s, and I seem to recall that most of their customers were using it to tell them when they had a memory leak or reused freed memory, not to remove the need for reference counting altogether. But it didn't tell you about cyclic references, unless they resulted in a memory leak.
> IDEA: If someone had an augmented ref-counting system with a runtime containing an optional cycle-detector and something like LINT but for the runtime reference graph, one would get most of the benefits of GC with the efficiency of the ref-counting system. I half expect someone to tell me that this already exists for Python.

Or you could do what Rust <https://github.com/graydon/rust/wiki/Language-FAQ>; does, and have different types of objects; stateful and stateless. Stateless object cannot have cycles (since you can't construct them in stateless objects, unless you're in a lazy language), and so reference counting can be used for stateless objects. Stateful objects, which can have cycles, are managed by the garbage collector.

Rust sounds like a really interesting idea, and I'd love to give it a try, but sadly it's still in the "some assembly required" stage as they work on bootstrapping it so the only real projects that are worthwhile to do with it yet are helping out with the bootstrapping process.

> This is my problem with GC. I like simplicity. Simplicity tends to perform well, and being simple also means it has little space for problems. Refcounting is simple and elegant,

Reference counting isn't conceptually any more simple than garbage collection (you still have to indicate how the underlying alloc() and free() operations are implemented). Further, a proper high performance reference counting system isn't any simpler than a high performance GC. In a multithreaded system you have to keep the reference counts in sync without doing a slow atomic operation every time a pointer is passed around, and you need to back the ref counting system with a high performance malloc()/free() that can handle multithreaded allocation and is resistant to memory fragmentation.

For higher-level systems programming (ie not bare metal) I think the approach taken by the Rust language is interesting. Mutable data structures are thread-local and are GCed (GC happens independently per thread). Immutable data structures can be passed between threads, are reference counted and support RAII.

https://github.com/graydon/rust/wiki/Language-FAQ

This is my problem with GC. I like simplicity. Simplicity tends to perform well, and being simple also means it has little space for problems

If your use case is teaching how to implement a garbage collector vs. a refcounting system, it is certainly much simpler to implement refcounting. If you are a systems programmer/writing a VM or compiler and do your work at a bare-metal level, then manual memory management/refcounting is probably your best choice.

For everyone else (let's guess over 90% of the programming population), using high-level GC'd languages is significantly simpler and more productive.

Implementing a stop-the-world, 2-space copying GC, provided you have complete stack, register and type layout data, is trivial. Implementing reference counting is much less so, especially in the presence of threads. Many simple scenarios turn into problems of lock-free programming proportions - and that's just for verifying that memory safety is present, not that the user hasn't introduced bugs with their own threading cock-ups.

(Delphi uses reference counting for interfaces, strings and dynamic arrays, and I am aware of race bugs in strings in particular (which are copy on write); these bugs are hard to fix without murdering performance, yet in practice they are very rare on x86 memory model hardware. So they stay.)

I've implemented both in C++.

See http://svn.boost.org/svn/boost/trunk/boost/smart_ptr/shared_... as an example of reference counting. Add in some atomic increment/decrement primitives and that's literally all there is to implement in one header file.

But even the lightest weight GC collector I could make had a significant C++ implementation file with lots of nontrivial pointer operations and loops in it. Even without scanning the stack for root objects, passing them in manually, I still think there's some amount of non-portable code in there.

That said, I'm using my GC for new stuff when the objects aren't too temporary. We'll see how it works out.

Unless you modified an existing or wrote a new C++ compiler, no, you haven't really (don't forget what you're piggy-backing on when the compiler is invoking ctors and dtors for you too). Boost shared pointers are toys; they don't deal with copy-on-write semantics. C++ exception safety is also hilariously difficult to get right (this adds to the refcounting problem); the language is broken by design.
Unless you modified an existing or wrote a new C++ compiler, no, you haven't really

Not sure which thing you're saying "I haven't really". But yeah, at times I have experimented with code that replaced built-in C++-runtime-library functionality.

(don't forget what you're piggy-backing on when the compiler is invoking ctors and dtors for you too).

Ctor/dtors by themselves don't, by themselves, generally allocate heap objects.

Boost shared pointers are toys; they don't deal with copy-on-write semantics.

I don't think that CoW is an essential feature of every allocation tracking scheme. Still, if you declare shared_ptr<const T> you can copy when you need to.

The std::unique_ptr and "move" semantics in C++11 are filling in some of those gaps in the core language.

C++ exception safety is also hilariously difficult to get right (this adds to the refcounting problem)

Be honest - it is exceptional conditions in all forms of programming are "hilariously difficult". C programs typically handle it sporadically or dedicate 50% of the code bulk (evenly throughout the program) to handling such conditions. C++ exceptions are tools to help you to put all that danger into a single facility which you can then point to and be horrified by. This is a great improvement.

the language is broken by design.

As someone else once replied to me, Yeah, if your definition of broken is awesome. :-)

It just depends on what you want out of your language. I like other languages too.

Of course you're leaving out the malloc()/free() implementation that the ref-counting scheme depends on that the GC doesn't.
Perhaps I was thinking that that's usually provided by the operating system "for free". Hard to imagine an actual nontrivial GC system not needing basic malloc/free at some point anyway. Come to think of it, I think I actually tried implementing something like that once.

Alternatively, you could implement a simple first-fit free-block-list scheme in a small amount of code. It might not look terribly different from something a GC might do anyway.

Still, I think it's a fair point.

And if delphi had complete GC, i would probably still be using it for new projects.
Nothing in the definition of Garbage Collection says it must be a separate post processing step. Therefore reference counting is in fact a garbage collection mechanism. Just because it is an eager algorithm rather than the usual lazy one doesn't change what it does, collect garbage in the system.
Nothing in the definition of Garbage Collection says it must be a separate post processing step.

My impression (based on reading up on the state-of-the-art some years ago) was that an actual concurrent-with-normal-execution GC needed to resort to putting a write barrier in front of the useful thread of execution much of the time. This would cause a noticeable perf hit.

There were architectures designed with hardware support for this kind of thing, but chips without it left them in the dust. Whether or not that is an accident of history or for a reason is an interesting discussion.

Right, and garbage collection is just lazy memory manamgement--you go see it either way, so this whole point is rather pedantic.
I like simplicity. Simplicity tends to perform well, and being simple also means it has little space for problems.

Amen.

I saw an old interview with Chuck Moore a while ago in which he said: I like simplicity and efficiency. That struck me. How often do people put those two things together? We're conditioned to think of them as a tradeoff. But if you can have both, shouldn't we be trying hard for that? Which raises another question: what do you have to give up in exchange for both simplicity and efficiency?

It's because "simple" can be a measure of different things. C is a pretty simple and straightforward language, but it takes 10 times as much code to do anything, which could be said to be quite a bit more complex.
In my experience, it takes an incredible amount of effort and experience to solve a complex problem with simplicity and efficiency. It's well worth it, but there's definitely a cost, and not everyone is capable of it.
That's been my experience too. Part of this difficulty, though, is that it requires going against our training and culture. This raises the question of how much easier it might get with different training and culture.

That's one thing that's so intriguing about Moore. He's a living specimen of an alternate computing history. I sometimes wonder what would have happened if he had been in, say, Backus's position at the dawn of high-level languages.

I tend to feel that the two often go together. After all, "the key to making a program fast is to make it do as little as possible".
> I like simplicity. Simplicity tends to perform well, and being simple also means it has little space for problems.

I like simplicity in my code, I am far less concerned about simplicity in the underlying libraries that I use.

GC makes my code simpler and more understandable. I use manual memory allocation when I absolutely need to, and GC languages when I can. In most cases, that means I use a GC language.