Hacker News new | ask | show | jobs
by clarus 770 days ago
Thanks for the comment! One of the authors here.

Indeed this would be a nice process to verify coq-of-rust. Also, although the code is rather short, we depend on the Rust compiler to parse and type-check the input Rust code. So that would need to be also verified, or at least formally specified without doing the proofs, and the API of rustc is rather large and unstable. It could still be a way to get more insurance.

3 comments

I didn't touch on that, but I did assume trust of the Rust toolchain, verifying starting at THIR. Verifying rustc would be a monumental undertaking, though I think some people are working on it.

Since we don't have a verified rustc (a la CompCert [0]), I wonder if an approach like the translation validation of seL4 [1] would work. They prove that the artifact (ARM machine code) produced by an existing compiler (gcc) for a chosen program (seL4) matches the source semantics (C). Thus you could circumvent trusting rustc, but it only works to verify a specific output of a chosen program. If the chosen program was coq-of-rust, I don't think this would be easier than the approach I detailed above. The seL4 kernel is 9,500 lines of C, while their Isabel/HOL specification is over 200,000 lines, so the technique doesn't seem to scale to a large chosen source like rustc.

Isn't bootstrapping fun?

[0]: Xavier Leroy. 2008. “Formal verification of a realistic compiler”. https://xavierleroy.org/publi/compcert-CACM.pdf

[1]: Thomas Sewell, Magnus Myreen, and Gerwin Klein. PLDI 2013. “Translation Validation for a Verified OS Kernel”. https://sci-hub.st/10.1145/2491956.2462183

Since you contrast the two approaches you might be interested in learning that CompCert also uses translation validation in some part of the compiler (notably for register allocation), see Jean-Baptiste Tristan's papers with Xavier.
Is there a Coq formalisation for enough of Rust that would be usable here? I thought people were still figuring that out.
The formalization work for Rust was done mostly at the MIR level, which is one step lower than the THIR level we use here. See, for example, the https://plv.mpi-sws.org/rustbelt/ project. MIR should be more amenable to formal specification, as this language is more compact than THIR and aims to be more stable.

However, we also lose some information going to MIR, as there are no expressions/loops from the original code anymore. There are still ways to reconstruct these, but we preferred to use the THIR representation directly.

That makes sense and fits what I had in mind: there is no formalization at the source level. Thanks for the details!
Very interesting work! I'm curious how you handle loops/recursion? I imagine the `M` monad seen in the examples has a special primitive for loops?
Yes there is a special primitive for loops in the monad. The primitives are uninterpreted, we define valid finite traces of execution of a program, and reason about these traces.

If a program has a loop we show that it terminates by constructing an execution trace. Note that we do not yet consider concurrency, so programs are deterministic.