by notamy 1738 days ago
Bit confused by this part of the article:

> PRO-REWRITE: Rust has manual memory management, so we would avoid the problem of having to wrestle with a garbage collector because we would just deallocate unused memory ourselves, or more carefully be able to engineer the response to increased load.

> ANTI-REWRITE: Rust has manual memory management, which means that whenever we’re writing code we’ll have to take the time to manage memory ourselves.

Isn't part of the point of Rust that you don't manage memory yourself, and rather that the compiler is smart enough to manage it for you?

13 comments

Yes, Rust kinda doesn't fit super cleanly into a very black/white binary here. It is automatic in the sense that you do not generally call malloc/free. The compiler handles this for you. At the same time, you have a lot more control than you do in a language with a GC, and so to some people, it feels more manual.

It's also like, a perception thing in some sense. Imagine someone writes some code. They get a compiler error. There are two ways to react to this event:

"Wow the compiler didn't make this work, I have to think about memory all the time."

"Ah, the compiler caught a mistake for me. Thank goodness I don't have to think about this for myself."

Both perceptions make sense, but seem to be in complete and total opposition.

"Manual vs automatic" is mostly just a semantic problem IMHO. We could say "runtime versus compile time" to be more precise, but maybe there are problems there as well. The more interesting question to me is "how much time/energy do I spend thinking about memory management, and is that how my time is best spent?". In cases of high performance code, you might spend more time fighting with the GC than you would with the borrow checker to get the performance you need, but for everything else the hot paths are so few and far between you're most likely better off fighting with the GC 1% of the time and not fighting anything the other 99%.

The Rust community has done laudable work in bringing down the cognitive threshold of "manual / compile-time" memory management, but I think we're finding out that the returns are diminishing quickly and there's still quite a chasm between borrow checking and GC with respect to developer velocity.

"developer velocity" is also, in some sense, a semantic question. I am special, of course, but basically, if you include things like "time fixing bugs that would have been prevented in Rust in the first place", my velocity is higher in Rust than in many GC'd languages I've used in the past. It just depends on so many factors it's impossible to say definitively one way or another.
I have trouble believing this, at least in any generalizable way. I'm comfortable in both Go and Rust at this point (my Rust has gotten better since last year when I was griping about it on HN), and it's simply the case that I have to think more carefully about things in Rust because Go takes care of them for me. It's not a "think more carefully and you're rewarded with a program that runs more reliably and so you make up the time in debugging" thing; it's just slower to write a Rust program, because the memory management is much fiddlier.

This seems pretty close to objective. It doesn't seem like a semantic question at all. These things are all "knowable" and "catalogable".

(I like Rust more now than I did last year; I'm not dunking on it.)

I know you're not :) I try to be explicit that I'm only talking about my own experience here. I try not to write about my experiences with Go because it was a very long time ago at this point, and I find it a bit distasteful to talk about for various reasons, but we apparently have quite different experiences.

Maybe it depends on other factors too. But in practice, I basically never think about memory management. I write code. The compiler sometimes complains. When it does, 99.9% of the time I go "oh yeah" and then fix it. It's not a significant part of my experience when writing code. It does not slow me down, and the 0.1% of the time when it does, it's made up for it in some other part of the process.

I wish there was a good way to actually test these sorts of things.

This jibes very well with my experience. I like writing Rust, but I do so well aware that I could write the same thing in Go and still have quite a lot of time left over for debugging issues.

I can also get user feedback sooner and thus pivot my implementation more quickly, which is a more subtle angle that is so rarely broached in these kinds of conversations.

The places where I think the gap between Go and Rust is the smallest (due to Rust's type system) are things like compilers where you have a lot of algebraic data types to model--Rust's enums + pattern matching are great here.
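To make the enums + pattern matching point concrete, here is a minimal sketch. The `Expr` type and `eval` function are made up for illustration and not taken from any real compiler:

```rust
// Hypothetical mini expression language; the variant names are illustrative only.
enum Expr {
    Num(i64),
    Neg(Box<Expr>),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

// The match must be exhaustive: adding a variant to Expr makes this
// function fail to compile until the new case is handled.
fn eval(e: &Expr) -> i64 {
    match e {
        Expr::Num(n) => *n,
        Expr::Neg(a) => -eval(a),
        Expr::Add(a, b) => eval(a) + eval(b),
        Expr::Mul(a, b) => eval(a) * eval(b),
    }
}

fn main() {
    // (2 + 3) * -4
    let e = Expr::Mul(
        Box::new(Expr::Add(Box::new(Expr::Num(2)), Box::new(Expr::Num(3)))),
        Box::new(Expr::Neg(Box::new(Expr::Num(4)))),
    );
    println!("{}", eval(&e));
}
```

In Go you would typically model this with an interface plus a type switch, and nothing checks that the switch covers every case.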

I always miss match and options (I could go either way on results, which tend to devolve into a shouting match between my modules with the type system badly refereeing). But my general experience is, I switch from writing in Rust to Go, and I immediately notice how much more quickly I'm getting code into the editor. It's pretty hard to miss the difference.
I don't do much Go, so I can't really compare it with Rust all that well, but I think it's a plausible result.

To take two GC'd languages, I'm proficient in both Java and Scala. It usually takes me a little longer to write something in Scala, but when I'm done, I've almost certainly written fewer bugs in the Scala program than the Java program (I've also written many fewer lines of code, but that's another topic).

For me, it's the type system that helps the most. Given that Rust's type system is much stronger and more expressive than Go's, I do expect to write fewer bugs in Rust than in Go. But it does feel like, if I had more experience with Go, I'd be significantly faster writing Go than Rust. (Then again, the more I write Rust, the fewer write-compile-fail-fix cycles I have to go through, and the compiler's ability to accept code as safe improves pretty frequently.)

Still, though (and I know this isn't the question at hand, but...), I personally value greater chances of correctness at compile time way more than development speed. While some types of bugs can be a fun adventure to track down and fix, most bugs I encounter are some mix of boring and annoying. I honestly would prefer to spend 2 weeks building and 2 days debugging over 1 week building and 1 week debugging. I really do find debugging that annoying. (Fuzzy numbers; I don't actually think I'd build 2x as fast in Go as Rust.)

> I personally value greater chances of correctness at compile time way more than development speed

In my experience, I don’t get much additional correctness for the extra effort, but rather I get independence from the GC, which is worth much less to me.

If we’re optimizing for correctness alone, I think development times could improve significantly by swapping the borrow checker for a gc. I know the borrow checker aids in correctness beyond what a gc does, but IMO the returns diminish rapidly. And I’m not sure how well this would work in practice, but maybe you could keep the borrow checker and add a GC, with every reference type being gc<> by default (not sure if that would recoup any of the extra correctness that a borrow-checker affords or not).

> it's just slower to write a Rust program, because the memory management is much fiddlier.

Really depends on what kind of programs you write. I found that my Rust development gets slowed down only because I have to spend time creating the proper types. Memory management and lifetime problems are very few in my practice (though I agree that they can swallow a lot of time, but only when you are new).

It's very much a confusing process. If C-styled memory management is skydiving and Python is parachuting, Rust can feel a bit like bungee-jumping. It's neither working for nor against you, but it will behave in a specific way that you have to learn to work around. Your reward for getting better at that system is less mental hassle overall, but it's definitely a strange feeling, particularly if you're already comfortable with traditional memory management.
> At the same time, you have a lot more control than you do in a language with a GC

Are there some examples of that?

Two examples: MutexGuard and Box.

Control is primarily exerted over consumers of your API rather than the actual resources. This can be enforced through a combination of Drop implementations, and closures / lifetimes; the classic example is Mutex's MutexGuard. In a GC language (e.g. Go) they give you defer or finally blocks that can accomplish the same thing, but that is always optional and up to other programmers to remember to do. Compare: you can't typically make someone run destructors in a GC language; you also wouldn't be able to guarantee the destructors have run at any particular point in time.
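A small sketch of the guard idea using std's Mutex. The `bump` helper is a made-up name; the point is that the data is only reachable through the guard returned by lock(), and there is no separate unlock call for a caller to forget:

```rust
use std::sync::Mutex;

// Hypothetical helper: increments a shared counter and returns the new value.
fn bump(counter: &Mutex<u32>) -> u32 {
    // lock() hands back a guard; the u32 is only accessible through it.
    let mut guard = counter.lock().unwrap();
    *guard += 1;
    *guard
    // `guard` is dropped when the function returns, which unlocks the mutex.
}

fn main() {
    let counter = Mutex::new(0);
    bump(&counter);
    bump(&counter);
    // try_lock succeeds because every earlier guard has already been dropped.
    assert!(counter.try_lock().is_ok());
    println!("{}", *counter.lock().unwrap());
}
```

The Go equivalent would be `mu.Lock(); defer mu.Unlock()`, which works fine but relies on every caller remembering the second half.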

The one area you have more control over actual resources is knowing when memory is freed. Some people need to know when memory is freed, because they have allocated a lot and if they do it again without freeing, they'll run into trouble. To know for sure, simply use a normal owned type or a unique pointer (Box); when it goes out of scope, that's when its destructor is run. No such feature exists in a GC language, because you can never know at compile time when nobody else holds a reference.
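Here is a rough sketch of "you know when it is freed". `BigBuffer` is a hypothetical type standing in for a large allocation, and the log exists only to make the order of events visible:

```rust
use std::cell::RefCell;

// Hypothetical stand-in for a large allocation; it records when it is freed.
struct BigBuffer<'a> {
    name: &'static str,
    _data: Box<[u8]>,
    log: &'a RefCell<Vec<String>>,
}

impl<'a> BigBuffer<'a> {
    fn new(name: &'static str, log: &'a RefCell<Vec<String>>) -> Self {
        log.borrow_mut().push(format!("alloc {}", name));
        BigBuffer { name, _data: vec![0u8; 1 << 20].into_boxed_slice(), log }
    }
}

impl Drop for BigBuffer<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("free {}", self.name));
    }
}

fn run() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _a = BigBuffer::new("a", &log);
    } // `_a` goes out of scope here, so it is freed before `b` is allocated
    let b = BigBuffer::new("b", &log);
    drop(b); // an owned value can also be freed explicitly, before end of scope
    log.borrow_mut().push("done".to_string());
    log.into_inner()
}

fn main() {
    for line in run() {
        println!("{}", line);
    }
}
```

The two buffers never coexist, and that is visible from reading the source rather than from profiling a collector.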

As a thought experiment: in JavaScript with WebAssembly, an allocation in WASM can be returned to JS as a pointer. You need to free it, somehow. Can you write a class that will deallocate a WASM allocation it owns when an instance of the class is freed by the JS GC? (Answer: no! You need a new language-provided FinalizationRegistry for that.)

Ah, so it's more about library writer control than about library consumer control? Since for example in Common Lisp, the latter can still be accomplished through declarations, such as DYNAMIC-EXTENT (http://clhs.lisp.se/Body/d_dynami.htm). (Not sure if the former is necessarily related to memory usage control, but you'd probably achieve that type of resource control by exposing only WITH-* macros in your API.)

Maybe D people would have something to say about this as well, but I'm not a D person. What you're describing doesn't seem impossible in D to me, though.

Edit: yes. Library consumers don't get to change much, except where you have generic functions that abstract over a trait like `T: Borrow<T2>`, and then you can pass in any kind of owned or borrowed pointer to T2.

Dynamic-extent appears to be more similar to the "register" hint in C than to anything in Rust, in that it's an implementation-defined-behaviour hint. Rust has no such thing as hinting at storage class. Your variables are either T (stack) or Box<T> (heap) or any other box-like construct involving T. You maintain complete control at all times, nothing is implementation-defined, and it's explicit. You can implement (and people have implemented) dynamic switching between stack and heap storage in a Rust library.

https://lib.rs/smallvec (stack to heap), https://lib.rs/tinyvec (smallvec with no unsafe code), https://lib.rs/arrayvec (stack only)

As you can see, these three library authors get to control very precisely how their types allocate and deallocate, and you basically mix and match these and the stdlib's smart pointers (and Vec) + other libraries like arenas, slot maps, etc to allocate the way you want.

> you'd probably achieve that type of resource control by exposing only WITH-* macros in your API

Yes, this and similarly using with_* closures both work, but both are more limited than destructors that run when something goes out of scope. A type that implements Drop can be stored in any other type, and the wrapper will automatically drop it. You can put MutexGuard in a struct and build an abstraction around it.
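A sketch of storing a guard inside another type. `Txn` and `fill` are hypothetical names; only Mutex and MutexGuard come from std:

```rust
use std::sync::{Mutex, MutexGuard};

// Hypothetical "transaction" wrapper that owns the lock guard as a field.
struct Txn<'a> {
    guard: MutexGuard<'a, Vec<i32>>,
    pushed: usize,
}

impl<'a> Txn<'a> {
    fn begin(m: &'a Mutex<Vec<i32>>) -> Self {
        Txn { guard: m.lock().unwrap(), pushed: 0 }
    }

    fn push(&mut self, v: i32) {
        self.guard.push(v);
        self.pushed += 1;
    }
}
// No Drop impl is needed on Txn: dropping it drops the guard field,
// and dropping the guard unlocks the mutex.

fn fill(m: &Mutex<Vec<i32>>) -> usize {
    let mut txn = Txn::begin(m);
    txn.push(1);
    txn.push(2);
    txn.pushed
}

fn main() {
    let m = Mutex::new(Vec::new());
    println!("{}", fill(&m));
    println!("{:?}", m.lock().unwrap());
}
```

With a with_* closure the lock's extent is tied to one call; here the wrapper can be returned, stored, or passed along, and the unlock still happens wherever it is finally dropped.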

I feel that this is one of those common misconceptions about Rust. Rust's memory management is nothing like C or non-modern C++'s with malloc/free or new/delete. Rust uses modern-C++'s RAII model, typically, to allocate memory. The compiler is smart enough to know when to call drop() (which is essentially free/delete, but with the possibility of additional behavior). You can also call drop() yourself.

What I think people _should_ focus on with Rust versus Go (et al) is that Rust allows you to choose where you _place_ memory. You can choose the stack or the heap. The placement can matter in hot regions of code. Additionally, Rust is pretty in-your-face when it comes to concurrency and sharing memory across thread/task boundaries.

Tangentially, I did a bit of Rust work recently. I was sadly unable to find a concise credible answer to a rather elementary best-practices question: How does ownership interact with nested datastructures? Is it possible to build a heap tree without Boxing every node explicitly?
This question is a bit subtle, it depends on exactly what you mean. You could make a tree using only borrow checked references and the compiler would make sure that parent nodes go out of scope at the same time or before the child nodes they point to, but I don't think that's what you're talking about.

In general, if it's a datastructure where you have to use pointers, you'll have them Box'ed, but you would try to avoid that if you can. In your example of a heap, you'd want to use an array-based implementation, probably backed by a growable Vec, and use indexes internally. A peek function would still return a normal Rust reference to the data, and the borrow checker would make sure that you don't mutate the heap's backing array while that reference was still in use, etc.
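A minimal sketch of the array-backed heap described above, assuming i32 keys. `MinHeap` is an illustrative type; in real code std::collections::BinaryHeap already does this:

```rust
// Binary min-heap stored in a Vec; parent/child links are index arithmetic.
struct MinHeap {
    items: Vec<i32>,
}

impl MinHeap {
    fn new() -> Self {
        MinHeap { items: Vec::new() }
    }

    fn push(&mut self, v: i32) {
        self.items.push(v);
        let mut i = self.items.len() - 1;
        // Sift up: the parent of index i is (i - 1) / 2.
        while i > 0 {
            let parent = (i - 1) / 2;
            if self.items[parent] <= self.items[i] {
                break;
            }
            self.items.swap(parent, i);
            i = parent;
        }
    }

    fn pop(&mut self) -> Option<i32> {
        if self.items.is_empty() {
            return None;
        }
        let last = self.items.len() - 1;
        self.items.swap(0, last);
        let min = self.items.pop();
        // Sift down: the children of index i are 2i + 1 and 2i + 2.
        let n = self.items.len();
        let mut i = 0;
        loop {
            let (l, r) = (2 * i + 1, 2 * i + 2);
            let mut smallest = i;
            if l < n && self.items[l] < self.items[smallest] {
                smallest = l;
            }
            if r < n && self.items[r] < self.items[smallest] {
                smallest = r;
            }
            if smallest == i {
                break;
            }
            self.items.swap(i, smallest);
            i = smallest;
        }
        min
    }

    // Returns a plain reference; while it is alive, calls to push/pop
    // on the same heap are rejected by the borrow checker.
    fn peek(&self) -> Option<&i32> {
        self.items.first()
    }
}

fn main() {
    let mut h = MinHeap::new();
    for v in [5, 1, 3] {
        h.push(v);
    }
    println!("{:?}", h.peek());
    while let Some(v) = h.pop() {
        println!("{}", v);
    }
}
```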

I never thought about using a Vec for these, but that is a great idea for keeping the memory management sane for tree/linked lists.

One thing I would add is that you need to be wary of destructors with large pointer data structures in Rust, since they can easily overflow the stack. When using Option<Box<T>> you need to be careful to call Option::take on the pointers in a loop to avoid stack overflow.
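Roughly what that looks like for a singly linked list. `List` and its methods are illustrative names; the important part is the Drop impl, which unlinks nodes in a loop so each Box is dropped with an empty `next` instead of recursing down the chain:

```rust
struct Node {
    elem: i32,
    next: Option<Box<Node>>,
}

struct List {
    head: Option<Box<Node>>,
}

impl List {
    fn push_front(&mut self, elem: i32) {
        let old = self.head.take();
        self.head = Some(Box::new(Node { elem, next: old }));
    }

    fn front(&self) -> Option<i32> {
        self.head.as_ref().map(|n| n.elem)
    }

    fn len(&self) -> usize {
        let mut n = 0;
        let mut cur = &self.head;
        while let Some(node) = cur {
            n += 1;
            cur = &node.next;
        }
        n
    }
}

impl Drop for List {
    fn drop(&mut self) {
        // Detach one node at a time; each node is dropped with next == None,
        // so the default (recursive) drop of Box<Node> never goes deep.
        let mut cur = self.head.take();
        while let Some(mut node) = cur {
            cur = node.next.take();
        }
    }
}

fn main() {
    let mut list = List { head: None };
    for i in 0..200_000 {
        list.push_front(i);
    }
    println!("{} nodes, front = {:?}", list.len(), list.front());
} // list dropped here without deep recursion
```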

You'd do the same stuff you'd do in C++ here; allocate every node explicitly, use an arena, whatever you want.
Thanks. Saw that before, but the credibility/length ratio wasn't high enough to read it more carefully. It appears that we do have to Box/Rc/Arc nodes in a recursive datastructure. Doable, but a bit on the inconvenient side.

    struct Node {
        elem: i32,
        next: Option<Box<Node>>,
    }
All explanations start with why, without any indirection, Node would be a recursive, infinitely large type. Therefore the Node must be behind a pointer. Ok. But then Rust forces you to answer this question: who will own the data referred to by the pointer? Consequently, who will be responsible for freeing it?

If you use a &mut Node as your pointer, you are attempting to answer those questions with "not me". Someone else has to own the Nodes. They've got to be somewhere on the stack or on the heap. There's nothing stopping you from defining the next pointer as Option<&'a mut Node>. The problem is actually constructing a list.

Not many answers tell you why you can't do this in practice. I agree that this is not explained well enough in general, because new Rustaceans don't intuitively reach for references so it probably doesn't come up much and they're hard to use for this. But it's not that hard to see why:

Imagine you try to allocate all the Nodes at once (e.g. an array), and then use &mut references into the pre-allocated array. In order to set one &mut Node's next pointer, you will have to hold another &mut Node to set it to. This means you need to acquire mutable references to array elements, in the order that you wish them to appear in the linked list. This is actually really tricky to do: slice::get_mut(index)'s returned reference borrows the entire slice, so it doesn't let you have a &mut reference to two nodes at the same time. You need smaller &mut [Node] slices, somehow.
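As an illustration of the "smaller slices" idea, here is a sketch using slice::split_at_mut from std. The `pair_mut` helper is a made-up name and only handles the i < j case:

```rust
// Hypothetical helper: mutable references to two distinct elements at once.
//
// The naive version does not compile, because each index borrows the whole slice:
//     let a = &mut items[i];
//     let b = &mut items[j]; // error: cannot borrow `*items` as mutable more than once
//     std::mem::swap(a, b);
fn pair_mut<T>(items: &mut [T], i: usize, j: usize) -> (&mut T, &mut T) {
    assert!(i < j && j < items.len());
    // Split into two non-overlapping sub-slices: [0, j) and [j, len).
    let (left, right) = items.split_at_mut(j);
    (&mut left[i], &mut right[0])
}

fn main() {
    let mut v = [10, 20, 30, 40];
    let (a, b) = pair_mut(&mut v, 0, 3);
    std::mem::swap(a, b);
    println!("{:?}", v);
}
```

This gets you two references, but chaining N nodes in arbitrary order this way is where it stops being practical.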

slice::split_first_mut is one way (in order), but if you have an array and can only create a linked list in the order nodes appear in the array, what's the point? Just use the array! Any other compiler-checked access order scheme will also be so limiting that you should just use a data structure of that exact shape anyway. To use an arbitrary order, you're going to need unsafe, so you'd basically be writing C.

To be fair, there is basically one application of this, and it's to have a sparsely populated constant size array that needs to be iterated in order. I made a demo:

https://play.rust-lang.org/?version=stable&mode=debug&editio...

The other problem is that you can't resize the backing array: your &mut Node references would be invalidated.

For this reason, pre-allocated lists like these are usually done with indices instead of references. The overhead is one pointer + offset and then a bounds check when dereferencing.
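A bare-bones sketch of the index approach. `IndexList` is an illustrative type, not a real crate:

```rust
// Nodes live in one Vec; `next` is an index into it rather than a reference.
struct Node {
    elem: i32,
    next: Option<usize>,
}

struct IndexList {
    nodes: Vec<Node>,
    head: Option<usize>,
}

impl IndexList {
    fn new() -> Self {
        IndexList { nodes: Vec::new(), head: None }
    }

    fn push_front(&mut self, elem: i32) -> usize {
        let idx = self.nodes.len();
        // The Vec may reallocate here; indices stay valid where references would not.
        self.nodes.push(Node { elem, next: self.head });
        self.head = Some(idx);
        idx
    }

    fn to_vec(&self) -> Vec<i32> {
        let mut out = Vec::new();
        let mut cur = self.head;
        while let Some(i) = cur {
            let node = &self.nodes[i]; // bounds-checked lookup
            out.push(node.elem);
            cur = node.next;
        }
        out
    }
}

fn main() {
    let mut list = IndexList::new();
    for v in [1, 2, 3] {
        list.push_front(v);
    }
    println!("{:?}", list.to_vec());
}
```

The trade-off is that a stale index is a logic bug the compiler will not catch, even though it can never be a memory-safety bug.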

---

The other solutions answer the ownership question like so:

- You can use Box, so that each Node (acting as a list head) owns the entire tail of the list, uniquely, such that no other list can also refer to it. The tail is freed when the head is, unless of course you detach it first (let tail = node.next.take();).

- You can Arc/Rc the nodes, so that each node has a pointer, but not a unique pointer, to the next node. These can be duplicated, so lists can exhibit structural sharing if you are comfortable with that. Because of the sharing, freeing the head does not necessarily free any/all of the tail.
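A sketch of the Rc option, showing two lists sharing one tail. `RcNode`, `cons`, and `to_vec` are illustrative names:

```rust
use std::rc::Rc;

struct RcNode {
    elem: i32,
    next: Option<Rc<RcNode>>,
}

fn cons(elem: i32, next: Option<Rc<RcNode>>) -> Rc<RcNode> {
    Rc::new(RcNode { elem, next })
}

fn to_vec(mut cur: Option<&Rc<RcNode>>) -> Vec<i32> {
    let mut out = Vec::new();
    while let Some(n) = cur {
        out.push(n.elem);
        cur = n.next.as_ref();
    }
    out
}

fn main() {
    let tail = cons(2, Some(cons(3, None)));
    // Two heads pointing at the same tail: structural sharing.
    let a = cons(1, Some(Rc::clone(&tail)));
    let b = cons(0, Some(Rc::clone(&tail)));
    println!("{:?} {:?}", to_vec(Some(&a)), to_vec(Some(&b)));
    drop(a); // frees only the node holding 1; the tail is still referenced
    println!("tail refs = {}", Rc::strong_count(&tail));
}
```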

An improvement, using the Node struct and apparently pushing the limits of borrowck: https://play.rust-lang.org/?version=stable&mode=debug&editio...

If you actually want this intrusive linked list functionality, consider using a real-world implementation like this: https://lib.rs/intrusive_collections

Thanks for the in-depth dive. My usecase is doing transformations over an AST: trees are immutable, but may become shared or dead deep in the middle of some complex transformation. Probably Rc<Node> is the reasonable approach, as Box is too constraining.
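For what it's worth, a sketch of how an immutable AST pass might look with Rc<Node>. The `Ast` type and the `x + 0 => x` rewrite are invented for illustration; untouched subtrees are shared with the input instead of being copied:

```rust
use std::rc::Rc;

// Hypothetical immutable AST.
enum Ast {
    Num(i64),
    Add(Rc<Ast>, Rc<Ast>),
}

// Rewrites `x + 0` to `x`. Subtrees that did not change are returned as
// clones of the same Rc (a refcount bump), not as fresh allocations.
fn simplify(node: &Rc<Ast>) -> Rc<Ast> {
    match &**node {
        Ast::Num(_) => Rc::clone(node),
        Ast::Add(l, r) => {
            let (l2, r2) = (simplify(l), simplify(r));
            if let Ast::Num(0) = *r2 {
                return l2;
            }
            if Rc::ptr_eq(l, &l2) && Rc::ptr_eq(r, &r2) {
                Rc::clone(node) // nothing changed below: reuse the old node
            } else {
                Rc::new(Ast::Add(l2, r2))
            }
        }
    }
}

fn main() {
    let x = Rc::new(Ast::Add(Rc::new(Ast::Num(1)), Rc::new(Ast::Num(2))));
    let tree = Rc::new(Ast::Add(Rc::clone(&x), Rc::new(Ast::Num(0))));
    let out = simplify(&tree);
    // The result is literally the old subtree, shared with the input tree.
    println!("{}", Rc::ptr_eq(&out, &x));
}
```

Nodes that become dead mid-transformation are freed as soon as their last Rc goes away, with no explicit bookkeeping.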
Of course. Arena allocation comes to mind.
> Additionally, Rust is pretty in-your-face when it comes to concurrency and sharing memory across thread/task boundaries.

Use channels whenever possible.

Channels are not always the best solution (unless you're referring to Rust channels?)

https://www.jtolio.com/2016/03/go-channels-are-bad-and-you-s...

Yeah, Rust's crossbeam channels are actually really good.
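For reference, a small sketch with the standard library's std::sync::mpsc channel (not crossbeam, whose API differs). `sum_squares` is a made-up function that fans work out to threads and collects results over a channel:

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical example: each worker squares a strided share of 0..n and
// sends the results back; values are moved across the channel, not shared.
fn sum_squares(n: u64, workers: u64) -> u64 {
    let (tx, rx) = mpsc::channel();
    for w in 0..workers {
        let tx = tx.clone();
        thread::spawn(move || {
            let mut i = w;
            while i < n {
                tx.send(i * i).unwrap();
                i += workers;
            }
            // This worker's sender is dropped when the closure ends.
        });
    }
    // Drop the original sender so rx.iter() ends once all workers finish.
    drop(tx);
    rx.iter().sum()
}

fn main() {
    println!("{}", sum_squares(10, 3));
}
```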
It kills me that RAII is considered modern C++. It's been there since 1983 aha, what do you think fstream and std::vector are if not RAII wrappers over files or memory?
I think before the introduction of move semantics in C++11, there were a lot of cases where you needed new and delete to get basic things working. (Moving an fstream around is a relevant example.) So the modern rule of "don't use new and delete in application code" really wasn't practical before that.
No, pretty much everything could be done with swap (like moving an fstream as you say). Sure, it's a bit more cumbersome, but it was still RAII.
I suppose RAII is an old concept, but move semantics allowing RAII to transfer ownership and avoid manual new/delete of non-copied resources was uncommon until C++11.
Before unique_ptr we didn't have a good way to handle RAII for a lot of things. I wrote a lot of RAII wrappers for various things (still do, but a lot less). Attempts like auto_ptr show just how hard it is to make RAII work well before C++11.

Yes we had RAII, but it didn't work for a lot of cases where we needed it.

Go = you do no explicit memory management and the GC/runtime takes care of it for you

Rust = when writing your code, you explicitly describe the ownership and lifetime of your objects and how your functions are allowed to consume/copy etc. them and get safety as a result

C = when writing your code, you explicitly allocate and free your objects and you get no assistance from the language about when it is safe to copy/dereference/free/etc. a pointer/allocation

I prefer to think that in Go you don't do explicit memory management by default, while in Rust you do. Although you can laboriously opt out of explicit memory management (e.g., by tagging everything Rc<> or Gc<> and all of the ceremony that entails).
> Isn't part of the point of Rust that you don't manage memory yourself, and rather that the compiler is smart enough to manage it for you?

For trivial cases, kind of. But once you start to do anything remotely sophisticated, no. Everything you do in Rust is checked w.r.t. memory management, but you still need to make many choices about it. All the stuff about lifetimes, borrowing, etc: that's memory management. The compiler's checking it for you, but you still need to design stuff sanely, with memory management (and the checking thereof) in mind. It's easy to back yourself into a corner if you ignore this.

While some commenters have pointed out that you still need to deal with lifetimes/thinking about where stuff lives, in practice you can avoid almost all of this by using Rc<Type> instead of Type everywhere (or Arc in a multithreaded scenario).

Yes Rc and equivalents have a performance overhead, but for many use cases the overhead really isn't that bad since you typically aren't creating tons of copies. In practice, I've found one can ignore lifetimes in almost all cases even when using references except when storing them in structs or closures. So really you would just need to increment the Rc counter for structs/closures outside of allocation/deallocation which is dominated by calls to malloc/free.

I've tried this before and it was so laborious that I regretted it. I'm not sure I saved myself any time over writing "vanilla" Rust or whatever one might call the default alternative. If I was really interested in writing Rust more quickly, I would just clone everything rather than Rc it, but in whichever case you're still moving quite a lot slower than you would in Go.
I've tried writing Rc-oriented Rust (for gtk-rs) too, and struggled hard with the pervasive cloning/aliasing needed, having to use weak references to avoid leaking memory, and the clone!() macro turning off rustfmt for all code in the method body. In fact, I'd rather deal with Qt-style memory management, with single QObject ownership, QPointer (which is kinda like a weak pointer), and praying you don't use-after-free.
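To show the kind of ceremony being described, here is a sketch of a parent/child widget tree with a weak back-pointer. This is not gtk-rs code; `Widget`, `add_child`, and `parent_name` are invented for illustration:

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical widget: strong references point down, a weak one points up.
struct Widget {
    name: String,
    parent: RefCell<Weak<Widget>>,
    children: RefCell<Vec<Rc<Widget>>>,
}

fn new_widget(name: &str) -> Rc<Widget> {
    Rc::new(Widget {
        name: name.to_string(),
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    })
}

fn add_child(parent: &Rc<Widget>, child: Rc<Widget>) {
    // A strong Rc here would form a cycle and leak both widgets.
    *child.parent.borrow_mut() = Rc::downgrade(parent);
    parent.children.borrow_mut().push(child);
}

fn parent_name(w: &Widget) -> Option<String> {
    // upgrade() returns None once the parent has been freed.
    w.parent.borrow().upgrade().map(|p| p.name.clone())
}

fn main() {
    let window = new_widget("window");
    let button = new_widget("button");
    add_child(&window, Rc::clone(&button));
    println!("{:?}", parent_name(&button));
    drop(window);
    println!("{:?}", parent_name(&button));
}
```

Every access goes through borrow()/upgrade()/clone(), which is the pervasive noise the comment above is complaining about.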

(Normally I use subclassing in Qt to associate extra state with a widget, but gtk-rs's subclassing API was arcane and boilerplate-heavy. Perhaps there's alternative paradigms for state management that follows Rust's single ownership principle better. Some people take a React/Elm-style approach, but I don't think virtual DOMs and diffing the entire UI tree on each user interaction are the last word on GUI interactivity and updates, and I don't find the added memory of virtual DOMs and CPU of generating/diffing them acceptable, but rather "pure overhead" to be eliminated in favor of minimal targeted UI state updates.)

You can also kind of do your own management of memory in GC languages, you just have to be extremely careful in code review to spot inadvertent allocations in the hot path. A great example is the "LMAX Disruptor" in Java: https://lmax-exchange.github.io/disruptor/

The trick is to pre-allocate all your objects and buffers and reuse them in a ring buffer. Similar techniques work in zero-malloc embedded C environments.
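The pre-allocate-and-reuse trick, sketched in Rust to stay consistent with the rest of the thread. This is a toy single-threaded ring, not the Disruptor's actual design; `Ring`, `publish`, and `consume` are made-up names:

```rust
// Fixed-capacity ring of reusable byte buffers. All allocation happens in
// new(); publish() only ever writes into an existing slot.
struct Ring {
    slots: Vec<Vec<u8>>,
    head: usize, // index of the next slot to read
    len: usize,  // number of published, unread slots
}

impl Ring {
    fn new(capacity: usize, slot_size: usize) -> Self {
        Ring {
            slots: (0..capacity).map(|_| Vec::with_capacity(slot_size)).collect(),
            head: 0,
            len: 0,
        }
    }

    // Copies `msg` into the next free slot, reusing that slot's allocation.
    fn publish(&mut self, msg: &[u8]) -> bool {
        if self.len == self.slots.len() {
            return false; // ring is full
        }
        let tail = (self.head + self.len) % self.slots.len();
        let slot = &mut self.slots[tail];
        if msg.len() > slot.capacity() {
            return false; // would force a reallocation, so refuse
        }
        slot.clear(); // keeps the capacity, drops the old contents
        slot.extend_from_slice(msg);
        self.len += 1;
        true
    }

    fn consume(&mut self) -> Option<&[u8]> {
        if self.len == 0 {
            return None;
        }
        let idx = self.head;
        self.head = (self.head + 1) % self.slots.len();
        self.len -= 1;
        Some(self.slots[idx].as_slice())
    }
}

fn main() {
    let mut ring = Ring::new(2, 8);
    ring.publish(b"ab");
    ring.publish(b"cd");
    println!("{:?}", ring.consume());
}
```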

While you may not have to directly call malloc and free in Rust, the memory management still feels very manual compared to a language with GC. When I want to pass an object around I have to decide whether to pass a &_, a Box<_>, Rc<_>, or Rc<RefCell<_>>, or a &Rc<RefCell<_>>, etc. And then there are lifetime parameters, and having to constantly be aware of relative lifetimes of objects. Those are all manual decisions related to memory management that you have to constantly make in Rust that you wouldn't need to think about in Go or Python or Java.
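To illustrate those choices side by side, a sketch with a made-up `Doc` type. Each signature says something different about who owns the value and who may mutate it:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical type used only to compare parameter styles.
struct Doc {
    words: Vec<String>,
}

// Shared borrow: the caller keeps ownership, read-only access.
fn count(doc: &Doc) -> usize {
    doc.words.len()
}

// Exclusive borrow: the caller keeps ownership, nobody else may touch it meanwhile.
fn append(doc: &mut Doc, word: &str) {
    doc.words.push(word.to_string());
}

// Owned Box: the callee takes ownership and the Doc is freed when this returns.
fn archive(doc: Box<Doc>) -> usize {
    doc.words.len()
}

// Rc<RefCell<_>>: shared ownership, with mutation checked at runtime instead.
fn append_shared(doc: &Rc<RefCell<Doc>>, word: &str) {
    doc.borrow_mut().words.push(word.to_string());
}

fn main() {
    let mut doc = Doc { words: vec![] };
    append(&mut doc, "hello");
    println!("{}", count(&doc));

    let shared = Rc::new(RefCell::new(doc));
    let alias = Rc::clone(&shared);
    append_shared(&alias, "world");
    println!("{}", count(&shared.borrow()));

    let boxed = Box::new(Doc { words: vec!["bye".to_string()] });
    println!("{}", archive(boxed));
}
```

In Go or Java all four would just be "takes a Doc", which is the point being made about degrees of manual-ness.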

Similarly, idiomatic modern C++ rarely needs new and delete calls, but I'd still say it has manual memory management.

I suppose it's reasonable to talk about degrees of manual-ness, and say that memory management in Rust or modern C++ is less manual than C, but more manual than Go/Python/Java.

There are already a lot of replies to this comment explaining the ideas behind Rust memory management in different ways, but I'll throw in my handwavy explanation as well:

In GC languages, memory management is generally done at runtime by the interpreter/runtime. In C, memory management is generally done at programming time by the (human) programmer. In Rust, memory management is generally done at compile time by the compiler. There are exceptions in all three cases, but the "default" paradigm of a language informs a lot about how it's designed and used.

You are still managing memory in Rust, it’s just more constrained, statically checked and inferred. Within those constraints you have full control.
I'm not a rust user, but I would argue you are still managing memory manually, you're just doing a lot of it through rust's type system, which can check for errors at compile time, rather than through runtime APIs like the C or C++ standard library. The question then becomes whether it is easier to manage memory through Rust's type system versus via standard runtime APIs.

From what I've read, Rust memory management actually requires more work but provides fantastic safety guarantees. This could mean that Rust actually lowers productivity at first, but as the complexity of the code base grows, some of that productivity is restored, or even surpasses that of C/C++, because you spend no time chasing runtime memory bugs.

For some products or projects, the costs of shipping a security flaw caused by a memory bug exploit could be high enough that a drop in productivity from Rust relative to C is still more than justified due to external costs that Rust mitigates.

I think sometimes the "compiler manages memory for you" concept gets overplayed a bit. It's not as complex as that description makes it sound. If you understand C++ destructors, it's really the same thing. Objects get destroyed when they go out of scope, and any memory or other resources they own get freed. The differences come up when you look at what happens when you make a mistake, like holding a pointer to a freed object. (Rust catches these mistakes at compile time, which does indeed involve some new complexity.)
Try to implement a data structure that works across async runtimes, or a couple of GUI widgets, then you will get the point why some of us complain about the borrow checker, even with decades of experience in C and C++.
Or rather, acting as if rust is positioned to replace general purpose languages.
It's very easy to "make it work" while fencing with compiler warnings by just copying things around instead of developing a clear sense of memory ownership. I've seen myself fall into this trap. The upside, coming from C, is that you don't have terrible memory safety issues. The downside is that you have the same data copied all over the place and (accidentally) allocate like a mad man. Managed memory is not inherently bad or good.
I also was confused about that part but for another reason: The whole post is basically "despite Go having a GC we had to manually manage the memory to make it work" and then the anti-rewrite is "Go does memory management for us". IMO people sometimes have really weird ideas about what is and isn't part of managing memory.