Hacker News new | ask | show | jobs
by steveklabnik 3701 days ago
> with no forms semantics forthcoming

Actually, €5MM grant has been given to some academics to produce a formal model of Rust over the next few years. So we'll see... A big part of it is also formalizing what unsafe means, which is the hard part.

1 comments

My understanding was that the goal of that research project is to produce tools for proving code written in unsafe Rust as safe at the API boundary (with limitations around FFI and inline assembly, of course).
I found two:

ftp://ftp.cs.washington.edu/tr/2015/03/UW-CSE-15-03-02.pdf

http://plv.mpi-sws.org/rustbelt/#project

The first formalized the safety-critical parts of the language to show them sound when used with safety on. The second will try to do that with unsafe. Nobody seems to be working on a full, formal semantics. That's disturbing given how many problems they tend to catch. So, SPARK/Ada combo still leads in safety for now.

Gave myself the idea that maybe someone can formalize a super/subset of Rust much like SPARK for Ada. Code most critical stuff in it with it just becoming functions other Rust code called. My original idea is possible at that point given complexity is minimized.

EDIT: Just finished skimming the first then reading conclusion in detail. It's apparently not Rust's model itself so much as a similar one named Patina that still has to map to the real thing. That's when the problems will show up. Also, Rust type system was updated several times while they were doing all that.

The fundamental problem is that Rust is actually two languages that interact with each other. There is the high-level language with a region system, traits, etc. as well as the low-level language used for unsafe code, which exposes most of the functionality and memory model of LLVM IR (which is very C-like).

Formalizing the high-level language would have some difficulties, e.g. the trait system has a lot of ad-hoc complexity, and there are convenience features like automatic destructors that improve programmer productivity but greatly complicate the definition of things like evaluation order (which is currently undefined for Rust). These things make the specification more complex, but they appear to be well within the limits of current programming language technology. For example, the safe subset of Rust could possibly be translated into a relatively standard type system with linear regions, e.g. http://www.cs.cornell.edu/people/fluet/research/substruct-re....

The difficulty comes at the semantic boundary, where the state-of-the-art in modeling the interaction between distinct languages is still far away from the safe / unsafe portions of Rust. This is an interesting research project, and fits in well with a general trend of cross-language interaction that appears in current research on programming language formalization. However, because it requires formalizing something that includes a C-like language, it's hard to imagine it ever being any simpler than formalizing a C-like language. If you're a user of unsafe code and just want to prove that the unsafe code is "memory-safe" (a somewhat nebulous term), then this helps you out. But if you're going to prove that your entire program is correct, why bother with all of the extra complexity?

Thanks for insightful reply. The difficulties make more sense now. Far as why at the end, the guarantees formal verification give you can far exceed basic, memory safety for common usage. That's why for many verification projects.
To clarify: I was rhetorically asking why one would bother with the complexity of completely verifying a program in Rust rather than C, not why one would want full verification rather than memory safety.
(a) The program had to be written in Rust as part of a whole application or for seemless interoperability

(b) You wanted to avoid problems of C. Opposed to it in general with Rust an option.

(c) Your hardware architecture easily supported Rust's safe subset but not C's model. Examples include Burroughs, Ten15, Java CPU, and SAFE (crash-safe.org) CPU's. Edit to say may as I'm not a Rust programmer. C being very unconstrained breaks most of these models.

(d) Others have proven precedents but this is speculative: doing a split between formal verification and type system for safety. A number of projects have done this. A simplifying example is that you model check or verify an abstract design that you code a Rust equivalent to. The semantics helps with the abstraction that knocks out A, B, and C classes of problems. The proven type system of Rust (or SPARK or whatever) is leveraged for X, Y, and Z problem classes. Each method is applied where it's strong.

These are best a quick brainstorm can do at this hour for me. :)

They say it a few different ways on their site. Regardless, it's research, so we'll see if they even figure it out! I hope so, though.