by millstone 2615 days ago
> Rust enforces single mutable ownership or multiple readonly aliases at a time. In fact, they are very good idioms to structure large codebase anyways, and normally they do not get in the way for ordinary applications.

No, these limitations routinely get in the way for ordinary applications. The borrow checker is a source of frustration when ramping up. Back-references get smuggled in as array indexes. Prohibiting global variables is tough. Any sort of app that can't be structured as a tree is going to have pain.
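To make the "back-references as array indexes" point concrete, here is a minimal sketch of the pattern. The `Tree` and `Node` types and their methods are made up for illustration, not taken from any library:

```rust
// Hypothetical tree stored in one flat Vec; nodes refer to each other by index.
struct Node {
    name: String,
    parent: Option<usize>, // back-reference stored as an index, not as `&Node`
    children: Vec<usize>,
}

struct Tree {
    nodes: Vec<Node>,
}

impl Tree {
    fn new() -> Self {
        Tree { nodes: Vec::new() }
    }

    // Adds a node and returns its index, which acts as its handle.
    fn add(&mut self, name: &str, parent: Option<usize>) -> usize {
        let id = self.nodes.len();
        self.nodes.push(Node {
            name: name.to_string(),
            parent,
            children: Vec::new(),
        });
        if let Some(p) = parent {
            self.nodes[p].children.push(id);
        }
        id
    }

    // Walks the parent indexes up to the root and joins the names.
    fn path(&self, mut id: usize) -> String {
        let mut parts = vec![self.nodes[id].name.as_str()];
        while let Some(p) = self.nodes[id].parent {
            parts.push(self.nodes[p].name.as_str());
            id = p;
        }
        parts.reverse();
        parts.join("/")
    }
}

fn main() {
    let mut tree = Tree::new();
    let root = tree.add("root", None);
    let child = tree.add("child", Some(root));
    let leaf = tree.add("leaf", Some(child));
    assert_eq!(tree.path(leaf), "root/child/leaf");
    assert_eq!(tree.nodes[root].children, vec![child]);
}
```

The borrow checker is satisfied because an index borrows nothing, but for the same reason it cannot tell you when an index has gone stale. That is the trade being described.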

This safety is really valuable, but let's not pretend it comes for free.

2 comments

The author is somewhat mistaken there - what Rust actually enforces is a clear alternative of exclusive ownership/borrowing, or shared access with multiple aliases being active at the same time. While these are normally identified with "mutable" vs. "readonly" access, this is not true in some cases, where special structures with "interior mutability" can be provided with different behavior. For example, if you need to share writable access to a piece of data, you can use the "Cell<>" or "RefCell<>" generic types. For an object which needs to have multiple "owners", each of which can extend its lifetime and prevent the object from being freed, there is the Rc<> type, etc. This stuff may not come for free, but quite often its cost can be made very reasonable while preserving desirable safety properties.
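A small sketch of the three types mentioned above, using only the standard library. The function names are made up for the example:

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

// Cell: shared writable access to a Copy value through plain `&` references.
fn bump_twice(counter: &Cell<i32>) -> i32 {
    let a = counter;
    let b = counter; // a second alias to the same cell
    a.set(a.get() + 1);
    b.set(b.get() + 1);
    counter.get()
}

// RefCell: the borrow rules are checked at run time instead of compile time.
fn append(log: &RefCell<Vec<String>>, line: &str) -> usize {
    log.borrow_mut().push(line.to_string()); // mutable borrow ends with this statement
    log.borrow().len()
}

// Rc: several owners; the value is freed when the last owner is dropped.
fn owners_while_shared_and_after() -> (usize, usize) {
    let first = Rc::new(RefCell::new(String::from("hi")));
    let second = Rc::clone(&first);
    second.borrow_mut().push_str(" there");
    let while_shared = Rc::strong_count(&first);
    drop(second);
    (while_shared, Rc::strong_count(&first))
}

fn main() {
    assert_eq!(bump_twice(&Cell::new(0)), 2);

    let log = RefCell::new(Vec::new());
    append(&log, "first");
    assert_eq!(append(&log, "second"), 2);

    assert_eq!(owners_while_shared_and_after(), (2, 1));
}
```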
The performance cost of interior mutability is often small. IMO the real cost is the undesirable safety properties.

For example consider an API like JS's getElementById(). In Rust, if a caller frame has a reference to the same element, this would just panic. It's impossible to statically enforce that no caller can have a reference to this element, and it's unreasonable to require it at runtime. So you either give up safety guarantees (viable, e.g. gtk-rs, but it leaves Rust anemic) or you give up the entire programming model (maybe viable, still a research project).

getElementById() is perfectly allowable under Rust's rules, though it may not look like what you expect (and to be fair Rust doesn't make it easy to write today).

The most straightforward way to get it working would be to return a `&Element` (or `Rc<Element>` if you like), and make all of `Element`'s fields `Cell`s. No panics, no runtime checks, and you can do everything JS can do. The cost is "infecting" the type definitions with `Cell<T>` and the usage with `.get`/`.set`, and the loss of what is normally rich pointer aliasing information for the optimizer (but which other languages don't have to begin with).
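Roughly what that could look like, as a sketch. `Element`, `Document`, and the field names here are hypothetical, not any real DOM binding:

```rust
use std::cell::Cell;
use std::collections::HashMap;

// Hypothetical element type: every field is a Cell, so `&Element` is enough to mutate it.
struct Element {
    width: Cell<u32>,
    visible: Cell<bool>,
}

struct Document {
    elements: HashMap<String, Element>,
}

impl Document {
    // Returns a shared reference; any number of these may be live at once.
    fn get_element_by_id(&self, id: &str) -> Option<&Element> {
        self.elements.get(id)
    }
}

// A callee that looks the element up again and mutates it.
fn widen(doc: &Document, id: &str) {
    if let Some(el) = doc.get_element_by_id(id) {
        el.width.set(el.width.get() * 2);
    }
}

fn main() {
    let mut elements = HashMap::new();
    elements.insert(
        "sidebar".to_string(),
        Element { width: Cell::new(100), visible: Cell::new(true) },
    );
    let doc = Document { elements };

    // The caller holds a reference while the callee mutates the same element.
    let held = doc.get_element_by_id("sidebar").unwrap();
    widen(&doc, "sidebar");
    held.visible.set(false);

    assert_eq!(held.width.get(), 200);
    assert!(!held.visible.get());
}
```

Note that `.get()` only works for `Copy` types. A field like `Cell<String>` would need `.take()` or `.replace()` instead, which is part of the ergonomic cost mentioned above.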

The reason Rust `&mut` feels so restrictive is that it allows you to change the "shape" of the object, thus invalidating any other references into it (if they existed): replace an enum value with a new variant, reallocate a `Vec`, overwrite a `Box`, etc. But in other memory-safe languages you can't do any of those things. Instead, any "shape changing" is done by allocating a new GC'd value and overwriting a pointer: enums are boxed (or don't allow interior pointers), arrays only contain primitives or pointers, etc.

So I like to think of `&Cell<T>` as a kind of third reference mode, that matches what people expect from other languages. It's not fun to use today, but there are a couple of language additions that could make it much, much nicer:

* First, field projection: given a type `struct S { x: T }` and a value `r: &Cell<S>`, let `r.x: Cell<T>`. This is safe as you can't invalidate `&r.x` by overwriting `r`, but by extension you can't project a reference through a `Cell` into an enum or `Vec` (just like other languages as described above).

* Second, some syntactic sugar for reading and writing. Replace `cell.get()` and `cell.set(x)` with `cell` and `cell = x`. Given that `Cell` has zero overhead (other than the loss of optimizations described above) this shouldn't be an issue.
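For what it's worth, the standard library already has a narrow version of the projection idea, for slices rather than struct fields: `Cell::from_mut` turns a `&mut T` into a `&Cell<T>`, and `as_slice_of_cells` turns a `&Cell<[T]>` into a `&[Cell<T>]`. A sketch (the function name is made up; struct field projection as proposed above is not shown because, as far as I know, std does not offer it):

```rust
use std::cell::Cell;

// Rewrites an array through aliased per-element cells.
fn rewrite_through_cells(data: &mut [i32; 3]) {
    // View the whole slice as one cell, then project it to a slice of cells.
    let whole: &Cell<[i32]> = Cell::from_mut(&mut data[..]);
    let parts: &[Cell<i32>] = whole.as_slice_of_cells();

    // Two aliases to the same element, both usable for writing.
    let first = &parts[0];
    let also_first = &parts[0];
    first.set(10);
    also_first.set(also_first.get() + 1);

    // Read two elements while writing a third.
    parts[2].set(parts[0].get() + parts[1].get());
}

fn main() {
    let mut data = [1, 2, 3];
    rewrite_through_cells(&mut data);
    assert_eq!(data, [11, 2, 13]);
}
```

The length of the slice can't change through the cells, which is the same "no shape changes" restriction described above.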

The more idiomatic way would be to return an Rc<RefCell<Element>>.

Accessing it will indeed panic if someone else has an active mutable reference to the contents of the refcell, but the idea is that you should only keep that for the section of code that actually modifies the object.

The reason this feature exists is that it prevents code from observing partially modified objects that could be temporarily missing required invariants (in addition to preventing mutation of objects while references to parts of them exist, which can result in dangling pointers and thus memory unsafety).
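A minimal sketch of that behavior, with a hypothetical `Element` type. It uses `try_borrow` so the conflict shows up as an `Err` rather than a panic; plain `borrow()` at the same point would panic:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical element; only the RefCell/Rc calls are real API.
struct Element {
    text: String,
}

// Mutates the element and reports whether an alias was locked out meanwhile.
fn edit_while_aliased(el: &Rc<RefCell<Element>>) -> bool {
    let alias = Rc::clone(el);

    let mut guard = el.borrow_mut(); // exclusive access starts here
    guard.text.push_str("!");
    // While `guard` is alive, nobody else can borrow the element.
    let conflict = alias.try_borrow().is_err();
    drop(guard); // exclusive access ends here

    conflict
}

fn main() {
    let el = Rc::new(RefCell::new(Element { text: "old".to_string() }));
    assert!(edit_while_aliased(&el));
    // Once the mutable borrow is released, reading works again.
    assert_eq!(el.borrow().text, "old!");
}
```

Keeping the `borrow_mut()` guard scoped to the few lines that actually modify the object is what keeps the panic from happening in practice.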

You can also use Cell, as the parent argues, with the benefit of never failing, but then you don't get protection from recursive calls exposing violated invariants, and you need to change the implementation of Element itself.

To me it was the opposite: it gave me a vocabulary and taught me how to think about these problems.

Shared mutable state and ownership exist in C, but I just don't get any compiler support for them. I can't even document them in code, so I (and users of my libraries) rely on RTFM.

In C I'd just "wing it", and tweak the code until it stops crashing. Maybe add a flag with "obj.free_data_ptr = true" and keep adding mutexes or copies of data where I suspect it's necessary.

In Rust I get predefined templates for this: owns & borrows, cells/atomics, refcounted and mutex containers, etc. The compiler says "nope, this is wrong!" and I get to consciously decide how to solve it: do I share or copy the data? Is the sharing dynamic, or just in a wrong scope? And my decisions are documented in code, and enforced by the compiler.
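As one example of such a template, here is a sketch of the "refcounted and mutex container" combination. The function name and the workload are made up; the point is that the sharing decision is spelled out in the type `Arc<Mutex<usize>>`:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Each thread adds its own index to a shared total.
fn sum_in_threads(n_threads: usize) -> usize {
    // Arc: shared ownership across threads. Mutex: one writer at a time.
    let total = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..n_threads)
        .map(|i| {
            let total = Arc::clone(&total); // each thread gets its own handle
            thread::spawn(move || {
                *total.lock().unwrap() += i;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let result = *total.lock().unwrap();
    result
}

fn main() {
    // 0 + 1 + 2 + 3
    assert_eq!(sum_in_threads(4), 6);
}
```

Swapping `Arc<Mutex<usize>>` for a plain `&mut usize` here is rejected by the compiler, which is the "nope, this is wrong!" moment described above.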