Hacker News new | ask | show | jobs
by Spartan-S63 1729 days ago
I feel that this is one of those common misconceptions about Rust. Rust's memory management is nothing like C or non-modern C++'s with malloc/free or new/delete. Rust uses modern-C++'s RAII model, typically, to allocate memory. The compiler is smart enough to know when to call drop() (which is essentially free/delete, but with the possibility of additional behavior). You can also call drop() yourself.

What I think people _should_ focus on with Rust versus Go (et al) is that Rust allows you to choose where you _place_ memory. You can choose the stack or the heap. The placement can matter in hot regions of code. Additionally, Rust is pretty in-your-face when it comes to concurrency and sharing memory across thread/task boundaries.

3 comments

Tangentially, I did a bit of Rust work recently. I was sadly unable to find a concise credible answer to a rather elementary best-practices question: How does ownership interact with nested datastructures? Is it possible to build a heap tree without Boxing every node explicitly?
This question is a bit subtle, it depends on exactly what you mean. You could make a tree using only borrow checked references and the compiler would make sure that parent nodes go out of scope at the same time or before the child nodes they point to, but I don't think that's what you're talking about.

In general, if it's a datastructure where you have to use pointers, you'll have them Box'ed, but you would try to avoid that if you can. In your example of a heap, you'd want to use an array-based implementation, probably backed by a growable Vec, and use indexes internally. A peek function would still return a normal Rust reference to the data, and the borrow checker would make sure that you don't mutate the heap's backing array while that reference was still in use, etc.

I never thought about using a Vec for these, but that is a great idea for keeping the memory management sane for tree/linked lists.

One thing I would add that you need to be wary of destructors with large pointer data structures in Rust since it can easily stack overflow. When using Option<Box<T>> you need to be careful to call Option::take on the pointers in a loop to avoid stack overflow.

You'd do the same stuff you'd do in C++ here; allocate every node explicitly, use an arena, whatever you want.
Thanks. Saw that before, but the credibility/length ratio wasn't high enough to read it more carefully. It appears that we do have to Box/Rc/Arc nodes in a recursive datastructure. Doable, but a bit on the inconvenient side.

    struct Node {
        elem: i32,
        next: Option<Box<Node>>,
    }
All explanations start with why without any indirection, Node would be a recursive, infinitely large type. Therefore the Node must be a pointer. Ok. But then, Rust then forces you to answer this question: who will own the data referred to by the pointer? Consequently, who will be responsible for freeing it?

If you use a &mut Node as your pointer, you are attempting to answer those questions with "not me". Someone else has to own the Nodes. They've got to be somewhere on the stack or on the heap. There's nothing stopping you from defining the next pointer as Option<&'a mut Node>. The problem is actually constructing a list.

Not many answers tell you why you can't do this in practice. I agree that this is not explained well enough in general, because new Rustaceans don't intuitively reach for references so it probably doesn't come up much and they're hard to use for this. But it's not that hard to see why:

Imagine you try to allocate all the Nodes at once (e.g. an array), and then use &mut references into the pre-allocated array. In order to set one &mut Node's next pointer, you will have to hold another &mut Node to set it to. This means you need to acquire mutable references to array elements, in the order that you wish them to appear in the linked list. This is actually really tricky to do: slice::get_mut(index)'s returned reference borrows the entire slice, so it doesn't let you have a &mut reference to two nodes at the same time. You need smaller &mut [Node] slices, somehow.

slice::split_first_mut is one way (in order), but if you have an array and can only create a linked list in the order nodes appear in the array, what's the point? Just use the array! Any other compiler-checked access order scheme will also be so limiting that you should just use a data structure of that exact shape anyway. To use an arbitrary order, you're going to need unsafe, so you'd basically be writing C.

To be fair, there is basically one application of this, and it's to have a sparsely populated constant size array that needs to be iterated in order. I made a demo:

https://play.rust-lang.org/?version=stable&mode=debug&editio...

The other problem is that you can't resize the backing array: your &mut Node references would be invalidated.

For this reason, pre-allocated lists like these are usually done with indices instead of references. The overhead is one pointer + offset and then a bounds check when dereferencing.

---

The other solutions answer the ownership question like so:

- You can use Box, so that each Node (acting as a list head) owns the entire tail of the list, uniquely, such that no other list can also refer to it. The tail is freed when the head is, unless of course you detach it first (let tail = node.next.take();).

- You can Arc/Rc the nodes, so that each node has a pointer, but not a unique pointer, to the next node. These can be duplicated, so lists can exhibit structural sharing if you are comfortable with that. Because of the sharing, freeing the head does not necessarily free any/all of the tail.

An improvement, using the Node struct and apparently pushing the limits of borrowck: https://play.rust-lang.org/?version=stable&mode=debug&editio...

If you actually want this intrusive linked list functionality, consider using a real-world implementation like this: https://lib.rs/intrusive_collections

Thanks for the in-depth dive. My usecase is doing transformations over an AST: trees are immutable, but may become shared or dead deep in the middle of some complex transformation. Probably Rc<Node> is the reasonable approach, as Box is too constraining.
Of course. Arena allocation comes to mind.
> Additionally, Rust is pretty in-your-face when it comes to concurrency and sharing memory across thread/task boundaries.

Use channels whenever possible.

Channels are not always the best solution (unless you're referring to Rust channels?)

https://www.jtolio.com/2016/03/go-channels-are-bad-and-you-s...

Yeah, Rust's crossbeam channels are actually really good.
It kills me that RAII is considered modern c++. It's there since 1983 aha, what do you think fstream and std::vector are if not RAII wrappers over files or memory
I think before the introduction of move semantics in C++11, there were a lot of cases where you needed new and delete to get basic things working. (Moving an fstream around is a relevant example.) So the modern rule of "don't use new and delete in application code" really wasn't practical before that.
No, pretty much everything could be done with swap (like moving an fstream as you say). Sure, it's a bit more cumbersome, but it was still RAII.
I suppose RAII is an old concept, but move semantics allowing RAII to transfer ownership and avoid manual new/free of non-copied resources was uncommon until C++11.
before unique_ptr we didn't have a good way to handle raii for a lot of things. I wrote a lot of RAII wrappers for various things (still do, but a lot less). Attempts like auto_ptr show just how hard it is to make raii work well before C++11.

Yes we had RAII, but it didn't work for a lot of cases where we needed it.