by cwzwarich 3332 days ago
A lot of the awkwardness that the author describes comes from destructors, which Rust has taken from C++. In fact, Rust has even inherited the incoherence between destructors and exceptions from C++, due to the lack of a solution to the double-throw problem and the need to write unsafe code that is correct in the face of unwinding.

The 'dropck' pass is one of the corners of the language that has no precedent in a type system that has been proven sound (at least as far as I am aware; someone please correct me if I'm wrong), and it has had a lot of soundness issues in the past.

The fact that destructors have magical powers that the language refuses to bestow on ordinary functions is a bad sign. And destructors are terrible for predictable code: the order in which destructors run for temporary results in a single expression is not even specified by the language, and there are some surprises (https://aochagavia.github.io/blog/exploring-rusts-unspecifie...) that make it harder to write correct unsafe code.
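For readers less familiar with the implicit calls being discussed, here is a minimal sketch of the cases whose order is well defined: locals are dropped in reverse declaration order at the end of their scope, and a temporary that is not bound to a variable is dropped at the end of its statement. It deliberately does not cover the several-temporaries-in-one-expression case mentioned above. `Noisy`, `touch`, and `run` are hypothetical names made up for the illustration.

```rust
use std::cell::RefCell;

// Hypothetical helper type: records its own destruction in a shared log.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl<'a> Noisy<'a> {
    fn touch(&self) {
        self.log.borrow_mut().push(format!("touch {}", self.name));
    }
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

fn run() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _a = Noisy { name: "a", log: &log };
        let _b = Noisy { name: "b", log: &log };
        // Unbound temporary: dropped at the end of this statement.
        Noisy { name: "temp", log: &log }.touch();
        log.borrow_mut().push("end of scope".to_string());
        // `_b` and then `_a` are dropped here, with no call written in the source.
    }
    log.into_inner()
}

fn main() {
    for line in run() {
        println!("{}", line);
    }
}
```

None of the three drop calls appears in the source text of `run`, which is the "invisible control flow" the thread is arguing about.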

If you were to design a language from the ground up with linear types and no destructors, it would be dramatically simpler than Rust.

5 comments

I agree that dropck is scary and needs more verification before we can have reasonable assurance of its soundness. But you're being unnecessarily reductive here: the tradeoff between control and ergonomics offered by destructors is well-known. If you think Rust is verbose today, imagine a Rust where every value had to be explicitly disposed of in every scope (including temporaries). Graydon was well aware of strictly-linear type systems, and chose to go with destructors for (heh) sound reasons.
It's not like destructors actually remove the complexity. The extra function call you need to add to replace the destructor is already present in your program; it's merely hidden from view. If you are trying to verify the code you have written (either informally or formally) and want to consider all paths through the program, then you need to include the invisible control-flow created by the compiler for destructors. I don't see how it gets any simpler by not being written in your program.
Nobody's talking about semantic simplicity; we're talking about developer ergonomics.
If you take the view that code is read more times than it is written (which is slightly amusing given the topic of this thread), then shouldn't you optimize for the ergonomics of reading and understanding code that is already written rather than writing it? I don't see how Rust's choice is defensible from that viewpoint. If I quizzed actual Rust users (rather than the language developers that are posting here) about the invisible control-flow created by the compiler, I doubt very many of them would get it right.
Code only gets read if it gets written in the first place, which it won't if your fresh new language makes it too hard to bootstrap an ecosystem and a community. One of the foundational premises of this year's ergonomics initiative is that Rust code is still too hard to write; a Rust-like language with linear types instead of destructors would be yet harder to write, and would only improve the ability to reason about code in a small minority of cases. I've been writing Rust code since before it was cool to claim you were doing things "before it was cool", and while I appreciate the cases where people have longed for linear types, I have never needed anything more than what destructors provide (which subverts the very premise of your comment, because, for my uses, enforcing linear types would be harder to both read and write). Nobody on the Rust team is pretending that there aren't tradeoffs.
Usually programmers don't care about the exact order that things are freed in. That's the whole reason why GCs have been so successful in programming languages (not to mention things like ARC, which are still deterministic but obscure).

I get that occasionally it's important to know the exact order, and this is where tighter rules and tooling can help. Rust wasn't designed to be an "everything is as explicit as possible" language. (Neither is C, for that matter, ever since compilers stopped paying attention to the "register" keyword…)

GC lets you write functions that return heap allocated values without worrying about who's responsible for freeing the result. That in turn leads to functional code over procedural, which in turn leads to simpler and easier understood code.

That is, it's less about order and more about bookkeeping. The ergonomics directly affect code quality.

Perhaps this is obvious to everyone else, but it isn't to me. Are you advocating explicitly dropping every variable I create? Like `init; use; drop;` explicitly where right now I essentially do `init; use;` and then there's an implicit drop when we exit scope?
You would be required to drop values that aren't moved elsewhere. Currently Rust has a one-bit reference count which dynamically tracks which values need to be dropped. Linear types would transform this into a static property that the compiler checks.

Part of the pain would come from conditionals:

    if something { 
        func(val1);
    } else {
        func(val2);
    }
This would be disallowed because the liveness of val1 and val2 can't be statically known after the conditional.
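To make the contrast concrete, here is a rough sketch of how today's Rust handles the same shape of code: it compiles, and the compiler decides at run time which of the two values still needs dropping at the end of the scope. `Noisy`, `func`, and `run` are hypothetical names for the illustration; under a strictly linear discipline you would presumably have to dispose of the other value explicitly in each branch.

```rust
use std::cell::RefCell;

// Hypothetical helper type: records its own destruction in a shared log.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

// Takes ownership, so the argument is dropped when `func` returns.
fn func(v: Noisy<'_>) {
    v.log.borrow_mut().push(format!("func({})", v.name));
}

fn run(something: bool) -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let val1 = Noisy { name: "val1", log: &log };
        let val2 = Noisy { name: "val2", log: &log };
        if something {
            func(val1); // val1 is moved only on this path
        } else {
            func(val2); // val2 is moved only on this path
        }
        log.borrow_mut().push("after if".to_string());
        // Whichever value was NOT moved is dropped here; that is tracked dynamically.
    }
    log.into_inner()
}

fn main() {
    println!("{:?}", run(true));
    println!("{:?}", run(false));
}
```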
I guess it is only for ones with nontrivial destructors.

That makes the "trivialness" of a destructor part of the interface, which is a price. But then from my experience in C++, we pay that price anyway, because to have any sort of assurance about what you are doing, you need some clue about what the destructor does.

It's hard for me to imagine that requiring every value to be explicitly dropped would produce code which is easier to read and understand.
And more specifically developer writing ergonomics.

In terms of understanding how the code will execute or testing your app, the implicit, confusing rules are less "ergonomic".

Is GC less ergonomic than C-style malloc/free? I don't think so, and not only when writing code. It makes things easier all around.

(Yes, GCs require tuning and so forth, and that can be a pain, but so do malloc and free, so that's a wash. In most applications you never need to manually tune a GC.)

Keep in mind today Rust is intended to be a better C++. (For loose values of 'intended'.)

Destructors are a common resource management idiom there (although I guess most C++ programmers couldn't describe the behavior correctly), and there are no other really common idioms for that.

Better the devil you kind of know?

Yeah dropck is a fun little gem, which has received several post-1.0 revisions due to soundness problems.

I agree the system you propose would be simpler in terms of spec and effort, but I don't know about simpler to use. Much like removing mutable references in favour of `foo(X) -> X` would be a simpler type system, but awful to use compared to `foo(&mut X)`.
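A small sketch of the two styles being compared, using a plain `Vec<i32>` as the `X`. The function names `push_owned` and `push_borrowed` are made up for the example.

```rust
// Ownership-threading style, `foo(X) -> X`: the caller gives the value up
// and has to rebind whatever comes back.
fn push_owned(mut v: Vec<i32>, x: i32) -> Vec<i32> {
    v.push(x);
    v
}

// Mutable-reference style, `foo(&mut X)`: the caller keeps its binding.
fn push_borrowed(v: &mut Vec<i32>, x: i32) {
    v.push(x);
}

fn main() {
    let a = vec![1, 2];
    let a = push_owned(a, 3); // must rebind, or `a` is gone

    let mut b = vec![1, 2];
    push_borrowed(&mut b, 3); // `b` is still usable afterwards

    assert_eq!(a, b);
    println!("{:?} {:?}", a, b);
}
```

With one argument the difference is small; it grows once several values have to be threaded through and returned as a tuple.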

The simplest system to use is unrestricted references with some form of automatic garbage collection. If you decide that you want a type system that statically tracks resource utilization, why stop halfway at something that only partially solves the problem? I guess you could one-up linear types here and ask for a type system that makes all creation and destruction of information explicit, e.g. https://www.cs.indiana.edu/~sabry/papers/reversible-logic.pd..., but this is probably a more useful route to explore for a hardware design language than a software programming language.
> If you decide that you want a type system that statically tracks resource utilization, why stop halfway at something that only partially solves the problem?

Because there's a tradeoff between ergonomics and static guarantees. A hybrid system (in this case, static tracking of ownership combined with automatic resource destruction) is a valid choice to balance the upsides and downsides of each extreme.

I'm not particularly educated in this issue, but if you're going to use a garbage collector, why not just use a language with a GC that abstracts away memory management completely? Like, what would be the advantages over Java?
The "point of Rust" is to be the simplest zero-runtime-overhead system to use.
Double-exception handling during stack unwinding in C++ is the thing I disagree most with.

And I talk from experience.

Back in the early 2000s, I modified both the MS VS runtime and gcc to be able to safely throw from destructors. Note to doubters: Java allows such double-exception cases, and gcc already had code back then to deal with the Java case.

The way to do it is to implicitly assume that every destructor run during stack unwinding has a try/catch surrounding it. This way, a first exception can be thrown from a destructor, but all exceptions that ultimately escape a destructor invoked during unwinding get eaten. (Note: you can still provide your own explicit try/catch in a destructor if you care about it.)

My experience with this tweak was that the horror stories people come up with to reject this approach are unfounded. Here are the reasons:

1. Actual cases of double exceptions are very, very rare.

2. In the cases that do arise, the second exception is often either a consequence of the first (for example, trying to access a DB where the first exception was a failure in some DB code) or the exact same one (for example, running out of memory).

3. In my experience (although, since the 2nd exception is lost, I cannot actively prove this), the first exception is the relevant one. This is especially true due to point #2.

4. In my experience, code that cares about the exact type of exception is most often either wrong or misguided. This is because such code assumes complete prescient power over what exceptions can be thrown.

5. In my experience, catching exceptions is 99% done in the top-level message or task dispatching, which doesn't care about the type of exceptions or how many occurred: you just abort the operation and do some logging.

6. The fact that double exceptions are handled gracefully informs your design, which builds up and reinforces all previous points.

I once had a discussion on this in the 90s in comp.lang.std.c++ and comp.lang.c++. People would not listen.

Note: to do it with STL, you do have to add try/catch within destroy() calls within containers to be able to destroy all items.
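Since the thread is about Rust, here is a loose Rust analogue of the "first failure wins" policy, not the C++ runtime patch described above. In Rust a panic that escapes a destructor while the thread is already unwinding aborts the process, so this sketch keeps the cleanup failure as an ordinary `Result` and uses `std::thread::panicking()` to decide whether to swallow it. `Connection`, `close`, `fail_on_close`, and `run` are hypothetical names for the illustration.

```rust
use std::cell::RefCell;
use std::panic::{self, AssertUnwindSafe};
use std::rc::Rc;
use std::thread;

// Hypothetical resource whose cleanup can fail.
struct Connection {
    log: Rc<RefCell<Vec<String>>>,
    fail_on_close: bool,
}

impl Connection {
    fn close(&mut self) -> Result<(), String> {
        if self.fail_on_close {
            Err("close failed".to_string())
        } else {
            Ok(())
        }
    }
}

impl Drop for Connection {
    fn drop(&mut self) {
        if let Err(e) = self.close() {
            if thread::panicking() {
                // Already unwinding from an earlier failure: eat this one.
                self.log.borrow_mut().push(format!("swallowed: {}", e));
            } else {
                // Normal exit: the cleanup failure is the only failure.
                self.log.borrow_mut().push(format!("reported: {}", e));
            }
        }
    }
}

fn run(panic_in_body: bool) -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let log_for_body = Rc::clone(&log);
    // Plays the role of the top-level dispatcher that catches everything.
    let result = panic::catch_unwind(AssertUnwindSafe(move || {
        let _conn = Connection { log: log_for_body, fail_on_close: true };
        if panic_in_body {
            panic!("first failure");
        }
    }));
    if result.is_err() {
        log.borrow_mut().push("first failure caught".to_string());
    }
    let out = log.borrow().clone();
    out
}

fn main() {
    println!("{:?}", run(true));
    println!("{:?}", run(false));
}
```

This assumes the default unwinding panic strategy; with `panic = "abort"` nothing is caught at all.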

I agree. The sad thing is that an effect system to ensure that destructors and other unwinding code can't throw new exceptions is quite simple to add to a language. Depending on the language/libraries it might cause a need to expose some new library functions that don't throw exceptions, but it's already a bad sign if you're doing nontrivial failure-prone work in destructors anyways.
Destructors seem to be a pain point for functional programming people. They're inherently imperative; they don't return anything because they have no one to return it to.

The "unsafe code" problem comes mostly from backpointers. If you have a data structure with a doubly linked list, and the forward pointer and backpointer are both pure references and can't be null, no order of destruction is strictly valid. You can't create, destroy, or manipulate a doubly linked list or a tree with backpointers in safe Rust. That's a problem.

Maybe forward pointer/backpointer pairs need to be a language level concept. The compiler needs to know that the forward pointer and the backpointer are in a relationship. The pair needs to be manipulated as a unit. You have to have mutable ownership of both references to manipulate either. The borrow checker and destructor ordering need to understand this.

> They're inherently imperative; they don't return anything because they have no one to return it to.

If you think of your whole program as being wrapped in an implicit State monad, holding a POSIXProcessState (e.g. exit code, registered signal handlers, file descriptors, etc.), then destructors are (POSIXProcessState -> POSIXProcessState) functions.

> You can't create, destroy, or manipulate a doubly linked list or a tree with backpointers in safe Rust. ... Maybe forward pointer/backpointer pairs need to be a language level concept.

You can define abstractions like this using unsafe code just fine. It doesn't need to be part of the language. (Think about how C++ "smart pointers" work: it's just a library.)

> Destructors seem to be a pain point
The problem is not so much typing as such (things that don't return anything but terminate -- as destructors do -- can be typed as Unit) but rather to find a good trade-off between expressivity of the language and simplicity of the typing system.

Basically explicit destructors mean the typing system needs to track lifetimes and ownership in some form or shape. There seem to be two main options.

- Simple lifetime/ownership scheme, but then you need a garbage collector anyway, and then it's mostly pointless to have explicit destructors. Just letting every variable be cleaned up by the GC makes for a simpler language (under the hood, clever escape analysis might be used for stack allocation of variables that don't escape their activation context).

- Avoid a GC, but then you need a complex typing system with unique owners to have any chance at expressivity (and you still need unsafe blocks and reference counting). This is Rust's choice.

Another issue is how consistently to combine destructors with other effects, in particular exceptions.

> Pointer/backpointer pairs need to be a language level concept.
As "JoshTriplett" also suggests, this is certainly an interesting idea, but I don't think a compelling choice has been found yet.
Think of backpointers as a combination of Rust optional pointers and weak pointers, mostly checked at compile time. The basic rule for backpointers is this: If a type instance A contains a backpointer P1, it must either be None, or a reference to a type instance B which has exactly one reference P2 to A.

Checks required:

- P2 cannot be changed when P1 is not None. (Run-time check; the compiler has to recognize when it is necessary.)

- P1 can only be set to None or B. (Compile-time check)

- P1 must be set to None before B is destroyed. This avoids a dangling pointer. (Compile-time check when possible, otherwise run-time check.)

- Borrow checking must treat a borrow using P1 as a borrow of B.

These simple rules would maintain the invariant for the backpointer. This allows doubly-linked lists without unsafe code. The backpointer is "weak" and doesn't count as ownership. It's basically weak pointers with a count of either 0 or 1.
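For comparison, the closest thing in the standard library today is an owning `Rc` forward pointer paired with an `Option<Weak<...>>` backpointer. The sketch below is only a run-time-checked approximation, assuming single-threaded use; it does not enforce the "exactly one P2" rule or any of the compile-time checks proposed above. `Node`, `new_node`, `link`, `next_value`, and `prev_value` are hypothetical names.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    value: i32,
    next: Option<Rc<RefCell<Node>>>,   // owning forward pointer (P2)
    prev: Option<Weak<RefCell<Node>>>, // non-owning backpointer (P1)
}

fn new_node(value: i32) -> Rc<RefCell<Node>> {
    Rc::new(RefCell::new(Node { value, next: None, prev: None }))
}

// Sets both halves of the pair together so they cannot disagree.
fn link(owner: &Rc<RefCell<Node>>, owned: &Rc<RefCell<Node>>) {
    owner.borrow_mut().next = Some(Rc::clone(owned));
    owned.borrow_mut().prev = Some(Rc::downgrade(owner));
}

fn next_value(node: &Rc<RefCell<Node>>) -> Option<i32> {
    let n = node.borrow();
    let v = n.next.as_ref()?.borrow().value;
    Some(v)
}

// Follows the backpointer; yields None if it is unset or the owner is gone.
fn prev_value(node: &Rc<RefCell<Node>>) -> Option<i32> {
    let n = node.borrow();
    let owner = n.prev.as_ref()?.upgrade()?;
    let v = owner.borrow().value;
    Some(v)
}

fn main() {
    let b = new_node(1); // B in the description above: holds P2
    let a = new_node(2); // A in the description above: holds backpointer P1
    link(&b, &a);
    println!("{:?} {:?}", next_value(&b), prev_value(&a));
    drop(b); // destroying B leaves A's backpointer unresolvable, not dangling
    println!("{:?}", prev_value(&a));
}
```

The cost is the one the proposal wants to avoid: reference counts and `RefCell` borrow flags checked at run time on every access.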

I'm not saying this can't be done, au contraire! Indeed most coherent programming idioms can be converted into typed language primitives. But there is a price to pay in terms of typing system complexity.

It's a slippery slope argument: if you add this, why stop there? Especially if you require run-time checks.

If there were a compelling set of operations that preserved the invariants without run-time checks, and it was expressive, i.e. it covered a large number of cases that you'd otherwise have to put into "unsafe", and it didn't ruin type inference ...

> Maybe forward pointer/backpointer pairs need to be a language level concept. The compiler needs to know that the forward pointer and the backpointer are in a relationship. The pair needs to be manipulated as a unit. You have to have mutable ownership of both references to manipulate either. The borrow checker and destructor ordering need to understand this.

There are many more patterns where that came from, and you don't want to teach the compiler about all of them. I don't think there's anything wrong with having a lower-level "unsafe" mechanism to let you use the language itself to build new types of structures that then provide a safe interface.

As a random example, consider the rust "intrusive-collections" crate (https://crates.io/crates/intrusive-collections), which provides the kind of "no extra pointer" linked list where you can embed a list head (or multiple list heads) directly in your structure. I don't think every such crate should have its functionality native in the compiler.

> rust "intrusive-collections" crate

People used to code like that, mostly in assembler and sometimes in C. It's not necessary for functionality. It's just an optimization. One that needs to be justified with benchmarks. Also, it's not at all clear that use of that module is safe.

I'm beginning to think there's a cult of l33t unsafe Rust programming, where people who write unsafe code think they're cool. I used to say that the way to cure new programmers of that is to put them on crash dump analysis for a few months. After they've found pointer bugs in other people's code, they'll have a better sense of why pointer safety is important.

> It's not necessary for functionality. It's just an optimization. One that needs to be justified with benchmarks.

I've worked with people who have the benchmarks to back it up; pointer traversals are expensive.

> people who write unsafe code think they're cool

I've tended to find the opposite: most of the Rust programmers I run into treat unsafe code as an occasionally necessary evil, and every time they write it they think about how the landscape could be improved so they wouldn't have had to, or how to encapsulate it in a separate crate with a small surface area.

This is a really interesting thought. In the presence of must-use types, you'd imagine that types with side effecting Drop impls (like files), would want to migrate to eventually being used/consumed by a "close" method. But doing that safely in the presence of panics would require... exception handlers?...
You could use something like Go's 'defer' statement to ensure 'close' is called before the current function returns.
Which already exists in Rust in the form of... wait for it... the Drop trait. It's the same thing.
The difference is that Go's `defer` is explicitly written out in the relevant scope, while Rust's `Drop` is implicit and defined elsewhere.
There's a "scopeguard" crate that uses Drop to make it explicit.
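A hand-rolled sketch of the idea, to show how little machinery it needs. This is not the scopeguard crate's API; `Defer`, `defer`, and `run` are hypothetical names written only for this illustration.

```rust
use std::cell::RefCell;

// Hypothetical guard type: runs the stored closure when it goes out of scope.
struct Defer<F: FnOnce()> {
    action: Option<F>,
}

impl<F: FnOnce()> Drop for Defer<F> {
    fn drop(&mut self) {
        if let Some(f) = self.action.take() {
            f();
        }
    }
}

fn defer<F: FnOnce()>(f: F) -> Defer<F> {
    Defer { action: Some(f) }
}

fn run() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        // The cleanup is written in the scope it applies to, as with Go's `defer`.
        let _guard = defer(|| log.borrow_mut().push("close"));
        log.borrow_mut().push("work");
    } // `_guard` is dropped here, which runs the closure
    log.into_inner()
}

fn main() {
    println!("{:?}", run());
}
```

The call is still implicit at the closing brace, but what will run is now visible at the point where the guard is created.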