Hacker News new | ask | show | jobs
by cwzwarich 3701 days ago
Ironically, it is probably easier to write tooling for verifying C code than it is for verifying Rust code. And the additional benefits of Rust also don't help you that much if you're going to be verifying things; whatever memory safety is provided by Rust would fall out of general correctness.

For C, you just need to come up with a restricted semantics for a subset of C (e.g. you can pick a particular evaluation order, assuming your implementation also respects it). Rust is a much more complex language, with no formal semantics forthcoming, and you would need to verify the semantic properties supposedly enforced by the type system, as well as their interaction with unsafe code. All of this would require new deep research, whereas the equivalent for C-like languages is well-trodden ground by now.

2 comments

> with no forms semantics forthcoming

Actually, €5MM grant has been given to some academics to produce a formal model of Rust over the next few years. So we'll see... A big part of it is also formalizing what unsafe means, which is the hard part.

My understanding was that the goal of that research project is to produce tools for proving code written in unsafe Rust as safe at the API boundary (with limitations around FFI and inline assembly, of course).
I found two:

ftp://ftp.cs.washington.edu/tr/2015/03/UW-CSE-15-03-02.pdf

http://plv.mpi-sws.org/rustbelt/#project

The first formalized the safety-critical parts of the language to show them sound when used with safety on. The second will try to do that with unsafe. Nobody seems to be working on a full, formal semantics. That's disturbing given how many problems they tend to catch. So, SPARK/Ada combo still leads in safety for now.

Gave myself the idea that maybe someone can formalize a super/subset of Rust much like SPARK for Ada. Code most critical stuff in it with it just becoming functions other Rust code called. My original idea is possible at that point given complexity is minimized.

EDIT: Just finished skimming the first then reading conclusion in detail. It's apparently not Rust's model itself so much as a similar one named Patina that still has to map to the real thing. That's when the problems will show up. Also, Rust type system was updated several times while they were doing all that.

The fundamental problem is that Rust is actually two languages that interact with each other. There is the high-level language with a region system, traits, etc. as well as the low-level language used for unsafe code, which exposes most of the functionality and memory model of LLVM IR (which is very C-like).

Formalizing the high-level language would have some difficulties, e.g. the trait system has a lot of ad-hoc complexity, and there are convenience features like automatic destructors that improve programmer productivity but greatly complicate the definition of things like evaluation order (which is currently undefined for Rust). These things make the specification more complex, but they appear to be well within the limits of current programming language technology. For example, the safe subset of Rust could possibly be translated into a relatively standard type system with linear regions, e.g. http://www.cs.cornell.edu/people/fluet/research/substruct-re....

The difficulty comes at the semantic boundary, where the state-of-the-art in modeling the interaction between distinct languages is still far away from the safe / unsafe portions of Rust. This is an interesting research project, and fits in well with a general trend of cross-language interaction that appears in current research on programming language formalization. However, because it requires formalizing something that includes a C-like language, it's hard to imagine it ever being any simpler than formalizing a C-like language. If you're a user of unsafe code and just want to prove that the unsafe code is "memory-safe" (a somewhat nebulous term), then this helps you out. But if you're going to prove that your entire program is correct, why bother with all of the extra complexity?

Thanks for insightful reply. The difficulties make more sense now. Far as why at the end, the guarantees formal verification give you can far exceed basic, memory safety for common usage. That's why for many verification projects.
To clarify: I was rhetorically asking why one would bother with the complexity of completely verifying a program in Rust rather than C, not why one would want full verification rather than memory safety.
They say it a few different ways on their site. Regardless, it's research, so we'll see if they even figure it out! I hope so, though.
"Rust is a much more complex language, with no formal semantics forthcoming" "All of this would require new deep research"

That's what my Googling showed me. Good call. SPARK, C subsets, and embedded Java still winning in this area.