by titzer 1887 days ago
Unfortunately the space is littered with confounding variables. For example, just measuring the impact of GC overhead is bound to be tricky, because a trashy program (one that allocates a lot of garbage) is going to make GC look bad in comparison to a program that is careful to reuse data structures. And that's mostly because of the second-order effects of the memory hierarchy, and not so much about the raw cost of allocation or deallocation! A lot of language library choices contribute to trashiness; e.g. in Java it is very difficult to avoid making lots of garbage if you use essentially any of the JDK. You can't even parse integers or strings effectively without multiple copies. I really don't know how Go's libraries look in that regard, but the truism generally holds that the more bullet-proof you try to make APIs, the more defensive copies you end up making.
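To make the "trashy" versus "reuse" distinction concrete, here is a minimal sketch in Rust. The function names are hypothetical and this is not a benchmark. Rust has no tracing GC, so it only shows the allocation pattern: the first version creates a fresh heap object per record (which would be garbage in a collected language), while the second refills a single buffer.

```rust
/// "Trashy" version: allocates a new String for every record.
/// (Hypothetical helper, for illustration only.)
fn count_matches_trashy(records: &[&str], needle: &str) -> usize {
    let mut count = 0;
    for r in records {
        // to_ascii_uppercase() returns a freshly allocated String each time.
        let upper = r.to_ascii_uppercase();
        if upper == needle {
            count += 1;
        }
    }
    count
}

/// Reusing version: one buffer, cleared and refilled on each iteration.
fn count_matches_reuse(records: &[&str], needle: &str) -> usize {
    let mut buf = String::new();
    let mut count = 0;
    for r in records {
        buf.clear(); // drops the contents but keeps the allocated capacity
        buf.push_str(r);
        buf.make_ascii_uppercase(); // in-place, no new allocation
        if buf == needle {
            count += 1;
        }
    }
    count
}

fn main() {
    let records = ["error", "ok", "Error"];
    // Both versions compute the same answer; only the allocation behavior differs.
    assert_eq!(
        count_matches_trashy(&records, "ERROR"),
        count_matches_reuse(&records, "ERROR")
    );
}
```

The second version only allocates when the buffer has to grow, so in the steady state it puts far less pressure on the allocator and the caches.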

Is the antidote to those inefficiencies an ownership model that forces you to reason about mutability? I don't know. I kind of think no, primarily because it is a performance consideration that infects the API and spreads everywhere; it's a distraction. Is it instead to have all immutable data structures and an insanely smart compiler that replaces copying with in-place mutation, so that pure functional languages compete with highly tuned, pervasively mutable code? I kind of also think no. And primarily that's because performance cliffs get worse and harder to predict, the smarter your compiler is.

The mutability/ownership question is confounded with allocation and deallocation. The latter really should never be on any programmer's mind, IMHO. In Rust, it seems there isn't much support for decoupling the two, e.g. by having an automatic garbage collector. That's also an unfortunate reality forced on language implementers by the fact that LLVM has steadfastly refused to support stackmaps as a first class concept for more than 15 years. Many, many projects have died because of LLVM's stackmap support being lacking or broken. That's unfortunate because precise stackmaps are a key enabler for many GC techniques, and without them, you end up with some form of conservatism that makes certain optimizations impossible and forces a particular design for nurseries.


Yeah, you have to get into all of these details to have a productive conversation. Likewise, I don't actually think a lot of this inherently has to do with performance. Basically, I could make a similarly lengthy comment about how

> I kind of think no, primarily that it is a performance consideration that infects the API and spreads everywhere; it's a distraction.

is something I fundamentally disagree with. Unfortunately, I don't really have the time at this moment to get into it at the level that you have here, but it's a good conversation to have, and that level of detail is needed to have it well.

Seconding the disagreement with the idea that mutability tracking is a distraction. I would give anything to have true deep-immutability enforced by TypeScript, but unfortunately the language semantics make it virtually impossible. In Rust, it might be my favorite feature.
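For readers less familiar with Rust, a small sketch of what "deep" immutability looks like there. The `Outer` and `Inner` types are hypothetical. The point is that a shared reference `&Outer` is read-only all the way down through plain nested data, with interior-mutability types such as `Cell` and `RefCell` being an explicit opt-out.

```rust
// Hypothetical nested data, for illustration only.
struct Inner {
    values: Vec<i32>,
}

struct Outer {
    inner: Inner,
}

/// Takes a shared reference: nothing reachable through `o` can be mutated here.
fn sum(o: &Outer) -> i32 {
    // o.inner.values.push(1); // rejected by the compiler: `o` is a `&` reference
    o.inner.values.iter().sum()
}

/// Mutation has to be requested explicitly in the signature.
fn append(o: &mut Outer, v: i32) {
    o.inner.values.push(v);
}

fn main() {
    let mut o = Outer { inner: Inner { values: vec![1, 2] } };
    assert_eq!(sum(&o), 3);
    append(&mut o, 4);
    assert_eq!(sum(&o), 7);
}
```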
Interesting. We definitely should have this conversation in the proper detail, because my estimates of Rust's language design priorities must be off.

> For example, just measuring the impact of GC overhead is bound to be tricky, because a trashy program (one that allocates a lot of garbage) is going to make GC look bad in comparison to a program that is careful to reuse data structures.

This is completely irrelevant because stop-the-world pauses are global and affect your entire application no matter how carefully you write your code. Every library you use has to be carefully written, and every single line of code you write has to be carefully written, regardless of whether it is time-critical or not. This is complete dead weight for the entire application.

With isolated GC heaps, your time-critical code will be isolated from non-critical code. In practice the easiest way to do this is to just launch a separate application. This hurts a lot with the JVM because it wants to own your entire server (mostly RAM) for some inexplicable reason, and you might as well write that part in C++ if you do inter-process communication anyway.

Thing is, in order to have pervasive reuse of data structures (mitigating the overhead of GC) while maintaining safety and correctness, you have to track uniqueness and mutability throughout the program. At that point, you've got something that's practically indistinguishable from borrowck.
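As a rough illustration of that point, here is a sketch in Rust where a caller-owned buffer is reused across calls. `parse_fields` and `field_counts` are hypothetical names. The exclusive `&mut` borrow is what makes the reuse safe: nobody else can be holding on to the old contents when the buffer is cleared.

```rust
/// Hypothetical parser that fills a caller-owned buffer instead of returning a new Vec.
fn parse_fields<'a>(line: &'a str, out: &mut Vec<&'a str>) {
    out.clear(); // safe: `&mut` guarantees no other reference sees the old contents
    out.extend(line.split(','));
}

/// Counts the fields on each line while reusing a single `fields` buffer.
fn field_counts(lines: &[&str]) -> Vec<usize> {
    let mut fields: Vec<&str> = Vec::new();
    let mut counts = Vec::new();
    for &line in lines {
        parse_fields(line, &mut fields);
        counts.push(fields.len());
        // Holding a reference into `fields` across the next call would not compile:
        //     let first = &fields[0];
        //     parse_fields(line, &mut fields); // error: `fields` is already borrowed
        //     println!("{first}");
    }
    counts
}

fn main() {
    assert_eq!(field_counts(&["a,b,c", "d,e"]), vec![3, 2]);
}
```

In a GC'd language without this kind of tracking, the equivalent API usually either returns a fresh list on every call or documents "do not keep the result past the next call" and hopes callers comply.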

The real use case for GC is managing data where there's no well-defined "ownership" pattern, such as when dealing with general graphs (e.g. in a symbolic computing or GOFAI application). That's a remarkably niche domain.