Hacker News
by afternooner 4396 days ago
Okay, can someone who understands this explain it a little better? My understanding is that ARC (automatic reference counting) isn't garbage collection because there isn't any garbage collector. I had thought that ARC was a compiler optimization that would analyze your code and add the destructor code while compiling, based on some possible usage graph or whatnot. ARC therefore is not garbage collection. Yes, it's memory management, but to treat the two as the same is wholly inaccurate. Please chime in to let me know how wrong I am.
3 comments

This is not how ARC works. ARC keeps a count (at runtime) of the number of references to dynamically allocated objects. Every time you create a new reference to an object, the count is incremented, and every time a reference goes out of scope, the count is decremented. When the count reaches zero, the memory is either freed immediately or marked as free-able for later.
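
To make the runtime counting concrete, here is a minimal sketch using Rust's `Rc`, which is a library reference-counted pointer and not ARC itself; I'm only using it as an analogy. The function name `observe_counts` is made up for illustration. It reads the count after creating a reference, after making a second one, and after the second one goes out of scope.

```rust
use std::rc::Rc;

/// Returns the strong count observed at three points:
/// after creation, after a second reference exists,
/// and after that second reference has gone out of scope.
fn observe_counts() -> (usize, usize, usize) {
    let first = Rc::new(String::from("payload"));
    let after_create = Rc::strong_count(&first);

    let after_clone;
    {
        // Creating a new reference increments the count at runtime.
        let second = Rc::clone(&first);
        after_clone = Rc::strong_count(&second);
    } // `second` goes out of scope here, so the count is decremented.

    let after_drop = Rc::strong_count(&first);
    (after_create, after_clone, after_drop)
}

fn main() {
    let (created, cloned, dropped) = observe_counts();
    println!("created: {}, cloned: {}, clone dropped: {}", created, cloned, dropped);
}
```

With `Rc` the memory is freed as soon as the last reference goes away; other reference-counting runtimes may defer that step.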

Your bit about compile-time code analysis to automatically insert the cleanup code when data is no longer reachable is actually pretty similar to what Rust does today with owned pointers. The downside is that this doesn't work with "normal" code -- you need certain annotations for the compiler to be able to perform this task correctly. In Rust this is done via region pointers (lifetime parameters) and borrow checking.
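
A rough sketch of what that looks like in Rust, assuming a made-up `Resource` type that writes to a log when it is cleaned up (all names here are hypothetical). The compiler decides statically where each value is dropped: a borrowed value stays alive, a moved value is dropped inside the function that took ownership, and the lifetime parameter `'a` is the annotation that lets the compiler check that the log outlives every resource that points at it.

```rust
use std::cell::RefCell;

/// A resource that records its own cleanup in a borrowed log.
/// The lifetime parameter `'a` states that the log outlives the resource.
struct Resource<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl<'a> Drop for Resource<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Borrows the resource without taking ownership, so nothing is cleaned up.
fn inspect(r: &Resource<'_>) -> usize {
    r.name.len()
}

/// Takes ownership; the compiler inserts the drop at the end of this function.
fn consume(r: Resource<'_>) {
    r.log.borrow_mut().push(format!("consume {}", r.name));
}

/// Runs the scenario and returns the recorded order of events.
fn run() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let a = Resource { name: "a", log: &log };
        let b = Resource { name: "b", log: &log };
        let _len = inspect(&a); // borrow: `a` is still owned by this scope
        consume(b); // move: `b` is dropped inside `consume`
        log.borrow_mut().push("end of scope".to_string());
    } // `a` is dropped here, with no runtime counting involved
    log.into_inner()
}

fn main() {
    for event in run() {
        println!("{}", event);
    }
}
```

No counter exists at runtime in this version; the drop points are fixed when the code is compiled.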

The canonical text on garbage collection by probably the most respected person in the field considers reference counting (automatic or otherwise) to be GC (http://gchandbook.org/contents.html). I know that's appealing to authority, but what else can you use to prove the definition of a term? Also see [1], which argues these algorithms are all the same thing anyway.

[1] D. F. Bacon, P. Cheng, and V. T. Rajan, “A Unified Theory of Garbage Collection,” in Proceedings of the 19th Conference on Object-Oriented Programming, Systems, Languages & Applications (OOPSLA), 2004.

While we can acknowledge that the more expansive definition of "garbage collection" includes ARC, we can simultaneously hold a narrower definition of "gc" that excludes ARC when discussing Apple & Swift. That's how Apple's documentation and most other sources use the term.

Apple docs:

"Garbage collection is deprecated in OS X Mountain Lion v10.8, and will be removed in a future version of OS X. Automatic Reference Counting is the recommended replacement technology."[1]

[1] https://developer.apple.com/library/ios/releasenotes/objecti...

[2] http://lists.apple.com/archives/objc-language/2011/Jun/msg00...

If we don't use the more common understanding of gc, Apple's documentation doesn't make sense. If we substitute ARC==GC into Apple's wording, we get nonsense such as:

"Garbage collection (which includes ARC) is deprecated in OS X Mountain Lion v10.8, and will be removed in a future version of OS X. Automatic Reference Counting (which is also part of garbage collection) is the recommended replacement technology."

With the insistence on ARC==GC, we'd have to parse that sentence as garbage collection is being removed and replaced with garbage collection.

It would be nice if we did not redefine the meaning of computer terms that have been understood for decades. I know it's semantics, but I take the definitions I learned 20 years ago in my college comp-sci textbooks as the agreed-upon definition.
Others would take the definitions they learned 10 or 50 years ago in their college comp-sci textbooks as the agreed-upon definition, and those differ.

For example, what I learned decades ago doesn't include that 2004 paper that convincingly shows reference counting to be one end of a scale that has pure GC at the other end.

Yet, that paper made me realize that reference counting is, in some sense, garbage collection.

On the other hand, I see no big problem in having an ambiguous term. That happens everywhere, including in science. Chemists have 'alcohol' (ethanol) vs 'an alcohol' (a family of compounds that includes ethanol), mathematicians have words such as 'algebra', biologists have roses as a family of plants and as a subset thereof, etc.

Before ARC they used manual reference counting. At runtime each object has a count of how many things have declared they have a reference to it. You can increment that count by doing [obj retain] whenever you know you're keeping a reference to it that will outlive the stack frame. When you're done with the object you can do [obj release] to decrement the counter. When the counter hits zero, the memory for the object can be freed (I don't actually know if it's done immediately or added to a list of objects to free later).
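
A toy model of that manual scheme, written in Rust only because it is easy to run; the `Counted` type and its methods are invented for illustration and are not Apple's runtime. It assumes a new object starts with a count of one, and it models "freed" as a flag, since whether a real runtime frees immediately or later is exactly the part I'm unsure about.

```rust
/// Hypothetical stand-in for an object with a manually managed retain count.
struct Counted {
    retain_count: u32,
    freed: bool,
}

impl Counted {
    /// A newly created object starts with a single owner.
    fn new() -> Self {
        Counted { retain_count: 1, freed: false }
    }

    /// Analogous to `[obj retain]`: declare one more reference.
    fn retain(&mut self) {
        assert!(!self.freed, "retain after free");
        self.retain_count += 1;
    }

    /// Analogous to `[obj release]`: give up one reference.
    fn release(&mut self) {
        assert!(!self.freed, "release after free");
        self.retain_count -= 1;
        if self.retain_count == 0 {
            // A real runtime would deallocate the object at or after this point.
            self.freed = true;
        }
    }

    fn state(&self) -> (u32, bool) {
        (self.retain_count, self.freed)
    }
}

/// Records (count, freed) after create, retain, release, release.
fn lifecycle() -> Vec<(u32, bool)> {
    let mut obj = Counted::new();
    let mut states = vec![obj.state()];
    obj.retain(); // keeping a reference beyond the current stack frame
    states.push(obj.state());
    obj.release(); // the longer-lived holder is done
    states.push(obj.state());
    obj.release(); // the creator is done, count hits zero
    states.push(obj.state());
    states
}

fn main() {
    println!("{:?}", lifecycle());
}
```

Forgetting one `release` leaves the count above zero forever, and one `release` too many hits the assertion, which are the two mistakes manual counting makes easy.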

What the automatic part of ARC does is, at compile time, analyse the code and automatically add the retain and release messages. If it determines the reference will outlive the stack frame, it adds a retain message. If it determines the reference can no longer be dereferenced (maybe because another object reference is assigned to that variable), it adds a release message.

The runtime characteristics are the same as manual reference counting; it just means the programmer no longer needs to work out where to add the retain and release messages (though they still need to think about reference cycles).
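
The reference-cycle caveat can be sketched with Rust's `Rc` and `Weak`, used here as an analogy for strong and weak references under ARC. The `Node` type and the `freed_nodes` function are hypothetical. Two nodes point at each other; if both links are strong, neither count ever reaches zero and nothing is freed, while a weak back-reference lets both be cleaned up.

```rust
use std::cell::{Cell, RefCell};
use std::rc::{Rc, Weak};

/// A node that can point at a peer either strongly or weakly,
/// and bumps a shared counter when it is actually freed.
struct Node {
    strong_peer: RefCell<Option<Rc<Node>>>,
    weak_peer: RefCell<Weak<Node>>,
    drops: Rc<Cell<u32>>,
}

impl Drop for Node {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

fn new_node(drops: &Rc<Cell<u32>>) -> Rc<Node> {
    Rc::new(Node {
        strong_peer: RefCell::new(None),
        weak_peer: RefCell::new(Weak::new()),
        drops: Rc::clone(drops),
    })
}

/// Links two nodes to each other and returns how many of them
/// were freed once the local references went out of scope.
fn freed_nodes(use_weak_back_reference: bool) -> u32 {
    let drops = Rc::new(Cell::new(0));
    {
        let a = new_node(&drops);
        let b = new_node(&drops);
        // a -> b is always a strong reference.
        *a.strong_peer.borrow_mut() = Some(Rc::clone(&b));
        if use_weak_back_reference {
            // b -> a is weak, so it does not keep `a` alive.
            *b.weak_peer.borrow_mut() = Rc::downgrade(&a);
        } else {
            // b -> a is strong, which closes the cycle (this leaks on purpose).
            *b.strong_peer.borrow_mut() = Some(Rc::clone(&a));
        }
    }
    drops.get()
}

fn main() {
    println!("strong cycle freed {} nodes", freed_nodes(false));
    println!("weak back-reference freed {} nodes", freed_nodes(true));
}
```

Pure reference counting cannot reclaim the strong cycle on its own, which is why the programmer still has to pick which side of such a relationship is weak.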