Hacker News new | ask | show | jobs
by ajross 1886 days ago
That's where I always end up when I start playing with Rust too. At this point I genuinely think I've given up on belief that the "borrow checker" paradigm is ever really going to be broadly successful (vs. "screw it, just do it in <other managed language>").

Problems with well-constrained allocation paradigms do really well in rust. Problems with reading/parsing/storing messy data structures from external data just really, really hurt.

I think a big part of the problem is that the ownership analysis layer has the dual crises of being (1) really complicated and too hard to keep in your head as a single model and (2) specified in an ad-hoc way that resists formal rules. So eventually you end up in of those puzzles like you mention where... there's just no answer! No one's been where you have, and the lack of rigor means you end up trying to reverse engineer the compiler front end trying to find the trick that works. And then you give up and write it in Go or Python or whatever.

1 comments

No, borrow checker's formal rules are rather simple. Here it is:

It is an error to access a place, when an access conflicts with a loan, and the loan is live.

You're describing the desired behavior of the compiler, but not what it actually implements.

In fact there are an infinite number of correct Rust programs which will never access memory incorrectly but which will still be rejected by the compiler, for the simple reason that Rust's authors, talented though they may be, have made no progress at all at solving the Halting Problem.

What Rust actually accepts is a subset of correct programs. And this subset is somewhat informally defined as "whatever the compiler could manage to prove". Real code hits against this limit occasionally, and when it does there's really no option other than "join the Rust team" or "try voodoo".

There are two problems here. Rust accepts a subset of correct programs that involve borrowing, that's absolutely true, but it gives you the tools to get around that: you either use Arc<Mutex<T>, where now you are in the same space that any refcounted GC language is in (with the extra flexibility that if you can "choose your guarantees"[1]), or use unsafe which lets you dereference memory freely in the same way that you would in C or C++.

The unstated problem is that there are patterns from other languages that are actually invalid and Rust is correctly denying, but not properly communicating it, which causes frustration. You can always make the argument that the failure modes are benign or rare, and for those cases noone is stopping you from actually using unsafe.

[1]: https://manishearth.github.io/blog/2015/05/27/wrapper-types-...

> The unstated problem is that there are patterns from other languages that are actually invalid

And this is why this argument persists. This is, to people outside the community, a semantic evasion:

1. It relies on a Rust-internal definition for "actually valid" (conforming to Rust's specific set of provability requirements) that doesn't correspond to what the rest of the world views as "correct" (not behaving incorrectly). Think about stuff like allocate-through-a-session-and-free-in-a-block paradigms (Apache was famous for this), or run-once-and-exit, or garbage collection, etc... Those things aren't "invalid" in any reasonable sense, they're just not what Rust programs do.

2. The definition for "valid" is (and this is my point above) entirely Rust-internal and ad hoc. It's not that we refuse to conform to your rules, really, it's that we don't know what they are!

There's an anecdote about a similar effect in the article too:

> Fun fact: while at Mozilla I heard multiple anecdotes of [very intelligent] Firefox developers thinking they had found a bug in Rust's borrow checker because they thought it was impossible for a flagged error to occur." However, after sufficient investigation the result was always (maybe with an exception or two because Mozilla adopted Rust very early) that the Rust compiler was correct and the developer's assertions about how code could behave was incorrect. In these cases, the Rust compiler likely prevented hard-to-debug bugs or even exploitable security vulnerabilities. I remember one developer exclaiming that if the bug had shipped, it would have taken weeks to debug and would likely have gone unfixed for years unless its severity warranted staffing.

Sometimes people think that they're right and that the compiler is wrong, but often, (not not necessarily always!) it's that they forgot some important piece of context.

(Oh, and Rust absolutely can do "allocate in a block and free at once" stuff, or "run once and exit" stuff...)

Rules are actually simple and not ad hoc. I in fact think Rust documentation should state rules and make them more prominent. In my experience, knowing the rules does not help you to keep them, but at least we will get less complaints of the sort "we don't know what the rules are".

To recap, here are the rules:

It is an error to access a place, when an access conflicts with a loan, and the loan is live. Access means source code construct to read or write, whether the construct is actually executed in runtime is irrelevant. Place is static approximation of a set of bytes in the memory. Place is either local (x), field (x.f), or upvar (local captured in closure). Indexing (x[i]) and method calls (x.m()) are approximated as local (x). Loan is either shared (&x) or mutable (&mut x). Read access conflicts with mutable loan, write access conflicts with all loans. Expression is live from the definition to the end of the containing statement. Declaration is live in connected region of control flow graph from the definition to potentially multiple last uses.

No, the rule I stated is what is actually implemented.

Like all type checking technology, the rule is about static behavior, not dynamic behavior. So halting problem is irrelevant.

Rust accepts all programs that follow the rule in static behavior. It's not a subset, and it's not a best effort to prove. It's the complete statement.

"Access" means source code construct to read or write. Rust doesn't care whether the construct is actually executed in runtime.

I believe the "subset of correct code" refers to things like the current non-support for things like

    let (left, right) = (&mut foo[..split], &mut foo[split..]);
where you have to rely on things like foo.split_at_mut(split), which are implemented as unsafe under the cover.

I see these are limitations but not show-stoppers in any way.

It's surely not a showstopper when it's on one line, but again the problem in practice is that those slices may be other operations or hidden behind other APIs, and happening at different depths in a complicated call tree, and you get an error only up at the top when what you're doing looks "clearly correct".

Yes, sure, the rule in question may be simple in the abstract (it's a compiler, after all -- they're just software doing straightforward things). But the ability for the poor programmer to detect which rule is being violated where is a lot more limited than the compiler is designed for. Thus the user with the upthread complaint, which is hardly unique.

I mean, at the end of the day if Rust wants to be an everyday language for this kind of everyday problem, the analysis paradigm needs to be communicated much better to everyday hackers (via docs, error messages, whatever). When Rust was new this seemed like just a technical problem to be solved with software maturity. I guess at this point after several years of regularly returning to play with Rust and being frustrated every time, I've mostly given up.

Sure, but you claimed there is no simple formal rule. That's simply false. It's just that the formal statement is not front and center in documentations, because it is not very useful to learn the rule, and most documentations are targeted to learners.
Yes, but the rule is still complete. What is going on is "place" is currently defined to be local x, or field x.f, or index x[i], but all x[i] are merged to x[*]. Rust's so-called NLL changed definition of "live". The point is, the rule is simple, formal, and complete. It is neither an ad hoc rule nor a best effort prover.
> What is going on is "place" is currently defined to be

Not to belabor this growing thread, but here's the disconnect. I complained the rules were ad hoc and complicated, you replied that they were simple, and when challenged on an edge case your treatment is to add another clause to re-define a term you used in isolation earlier.

That's what "ad hoc and complicated" means.

Link to the relevant spec then? Admittedly my intuition is a year stale at the moment, but I went looking hard for this at one point and found nothing other than source code.
No spec, because technical details are complex, but spec, when it is written, will just be an elaboration of "It is an error to access a place, when an access conflicts with a loan, and the loan is live". This has never changed since Rust 1.0, more than 5 years. Spec will just need to define "place" and "live" and that takes time.
There's no spec yet, but there are working groups working on it.