by mcartyem 4756 days ago
(thank you for the willingness to write such a detailed response)

There exists a point on the memory management continuum where management starts being more automatic (gc) than manual (malloc/free). I would like to understand the forces surrounding this specific point, right before the scale tips towards automatic.

If you tried to build a dynamic language without automatic management, what would break and why?

2 comments

The biggest problem is scope control; as you start having closures that get passed around freely, those closures drag values along with them that you can't collect. It is not impossible to write this with malloc/free, but I've played that game and it's not very fun. And remember, what seems easy in one little blog post isn't easy in a real program where you've got dozens of the little buggers flying every which way. (And by dozens, I mean dozens of distinct types of closures from different sources, not just dozens of instances of the same code.)
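To make the "drag values along" point concrete, here is a minimal sketch. It assumes CPython, whose reference counting frees an object as soon as its last reference goes away; `Buffer` and `make_reader` are hypothetical names invented for the illustration. The captured value cannot be released when the scope that created it ends, only once every closure that captured it is gone, and with manual freeing you would have to track that moment yourself.

```python
import weakref


class Buffer:
    """Hypothetical stand-in for any value a closure might capture."""


def make_reader():
    buf = Buffer()
    probe = weakref.ref(buf)   # lets us observe buf without keeping it alive

    def reader():
        return buf             # the closure captures buf

    return reader, probe


reader, probe = make_reader()

# make_reader() has returned, yet buf must stay alive: the closure owns it now.
alive_while_closure_exists = probe() is not None

# Only dropping the last closure makes buf collectable.
# (Assumption: CPython, where refcounting frees it immediately.)
del reader
alive_after_closure_dropped = probe() is not None
```

With malloc/free, the `del reader` step is where someone would have to know that this was the last closure holding `buf`.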

Many of the dynamic languages fling closures around with wild abandon, often without you even realizing it. (One object has a reference to another object which is from the standard library which happens to have a closure to something else which ends up with a reference to your original object... oops, circular reference loop. Surprisingly easy.)
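A small sketch of how easily such a loop appears, again assuming CPython (reference counting plus a separate cycle detector in the `gc` module). `Node` and `make_cycle` are hypothetical names; the only ingredient is an object that stores a callback which happens to mention the object itself.

```python
import gc
import weakref


class Node:
    """Hypothetical object that stores a callback, as many library objects do."""

    def __init__(self, name):
        self.name = name
        self.callback = None


def make_cycle():
    node = Node("original")
    # The lambda captures `node`; storing it on `node` closes the loop:
    # node -> callback -> captured variable -> node
    node.callback = lambda: node.name
    return weakref.ref(node)   # observe the node without keeping it alive


gc.disable()                   # leave only reference counting active
probe = make_cycle()

# No outside reference remains, but the refcount never reaches zero.
alive_after_scope_exit = probe() is not None

# The cycle detector finds the unreachable loop and reclaims it.
reclaimed = gc.collect()
alive_after_collect = probe() is not None
gc.enable()
```

Under pure reference counting (or a naive free-when-done discipline) the node would simply leak; it takes a tracing pass to notice that nothing outside the loop points into it.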

There isn't much technically impossible with malloc/free (though IIRC there are indeed some cases that are both sensible and actually can't be correctly expressed in that framework, but the example escapes me), but there's lots of practical code where the cost/benefit ratio goes absurdly out of whack if you're trying to write the manual freeing properly. It's hard to write an example here, because it arises from interactions in a large code base exceeding your ability to understand them. It's like when people try to demonstrate how dangerous old-style pthreading is; even though the problem is exponentially bad in nature, anything that fits in a blog post is still comprehensible. The explosion doesn't happen until you get to real code.

I can see how closures would take some work. Thanks.

There's evidence garbage collection was not the desired solution but a plan B. McCarthy writes the reason reference counting was not implemented was a hardware limitation [1]:

"Since there were only six bits left in a word, and these were in separated parts of the word, reference counts seemed infeasible without a drastic change in the way list structures were represented. (A list handling scheme using reference counts was later used by Collins (1960) on a 48 bit CDC computer).

The second alternative is garbage collection..."

[1] - http://www-formal.stanford.edu/jmc/history/lisp/node3.html#S...

If the language was also dynamically typed, that could also lead to some interesting problems. What is the actual size used to represent a string, or an int, or a double in the language? What happens when you want to compare/convert different types?

Either the language provides these capabilities, in which case it's doing some memory management of its own, or you write them, in which case it wasn't necessarily all that dynamic to start with.
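As a rough sketch of what "the language provides these capabilities" means at the byte level: every value carries a tag, the runtime knows the size of each representation, and comparison across types goes through a conversion the runtime chooses. The layout below (a 1-byte tag followed by an 8-byte payload) and all the names are assumptions made up for illustration, not how any particular interpreter does it.

```python
import struct

# Hypothetical type tags for a toy dynamically typed runtime.
TAG_INT, TAG_DOUBLE = 0, 1

# "<" means little-endian with no padding: 1 tag byte + 8 payload bytes.
INT_LAYOUT = "<Bq"
DOUBLE_LAYOUT = "<Bd"


def box_int(n):
    """Store an integer as tag + 64-bit signed payload."""
    return struct.pack(INT_LAYOUT, TAG_INT, n)


def box_double(x):
    """Store a float as tag + 64-bit IEEE 754 payload."""
    return struct.pack(DOUBLE_LAYOUT, TAG_DOUBLE, x)


def unbox(cell):
    """Read the tag first; only then do we know how to read the payload."""
    tag = cell[0]
    if tag == TAG_INT:
        return struct.unpack(INT_LAYOUT, cell)[1]
    if tag == TAG_DOUBLE:
        return struct.unpack(DOUBLE_LAYOUT, cell)[1]
    raise ValueError(f"unknown tag {tag}")


def values_equal(a, b):
    """Cross-type comparison: this toy runtime converts both sides to double."""
    return float(unbox(a)) == float(unbox(b))


three_int = box_int(3)
three_double = box_double(3.0)
cell_size = len(three_int)
same = values_equal(three_int, three_double)
```

Someone has to allocate and eventually release those cells; if it's the runtime, that is the memory management mentioned above, and if it's you, you are writing `box`/`unbox` by hand at every use.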

I imagine it might look like what you get when you try to write a dynamic language in C (Perl, Ruby, Python): a lot of macros that automate what's going on, to the degree that you are mostly writing in some macro-based DSL.