Hacker News new | ask | show | jobs
by johnisgood 252 days ago
How does that work?

Why can't other languages have a "formal verification library"?

1 comments

> How does that work?

Usually taking the IR (MIR) from rustc and translating it to a verifier engine/language, with the help of metadata in the source (attributes) when needed. E.g. Kani, Prusti, Creusot and more.

> Why can't other languages have a "formal verification library"?

I don't think there is a reason that prevents that, and perhaps some have, however it turns out that modelling shared mutability formally is really hard, and therefore the shared xor mutable rule in Rust really helps verification.

Wouldn't C have such a library already if it really was "possible"? You can turn off strict aliasing.
Not a library, but it kinda does: Frama-C will formally verify C code using special comments to write the contracts in, and I hear it can prove more pointer properties than SPARK, although it takes more more effort to maintain a Frama-C codebase due to all the other missing language features.
Does that mean I can use C + Frama-C so I won't have to use Rust (or Ada / SPARK)? :P
Well, I'm not an expert in formal verifications, but there are such libraries, I listed few of them above, you can go and check them. So it is possible.

C doesn't have the shared xor mutable rule - with strict aliasing or without.

I checked them, but I am not convinced and I am not sure why it was brought into this thread.

SPARK has industrial-strength, integrated verification proven in avionics (DO-178C), rail (EN 50128), and automotive (ISO 26262) contexts. Rust's tools are experimental research projects with limited adoption, and they're bolted-on and fundamentally different from SPARK.

SPARK is a designed-for-verification language subset. Ada code is written in SPARK's restricted subset with contracts (Pre, Post, Contract_Cases, loop invariants) as first-class language features. The GNAT compiler understands these natively, and GNATprove performs verification as part of the build process. It's integrated at the language specification level.

Rust's tools retrofit verification onto a language designed for different goals (zero-cost abstractions, memory safety via ownership). They translate existing Rust semantics into verification languages after the fact - architecturally similar to C's Frama-C or VCC (!). The key difference from C is that Rust's type system already guarantees memory safety in safe code, so these tools can focus on functional correctness rather than proving absence of undefined behavior.

Bottom line is that these tools cannot achieve SPARK-level verification for fundamental reasons: `unsafe` blocks create unverifiable boundaries, the trait system and lifetime inference are too complex to model completely, procedural macros generate code that can't be statically verified, interior mutability (`Cell`, `RefCell`) bypasses type system guarantees, and Rust can panic in safe code. Most critically, Rust lacks a formal language specification with mathematical semantics.

SPARK has no escape hatches, so if it compiles in SPARK, the mathematical guarantees hold for the entire program. SPARK's formal semantics map directly to verification conditions. Rust's semantics are informally specified and constantly evolving (async, const generics, GATs). This isn't tooling immaturity though, it's a consequence of language design.

> SPARK has no escape hatches

It does, you can declare a procedure in SPARK but implement it in regular Ada. Ada is SPARK's escape hatch.

This is just as visible in source code as unsafe is in Rust, a procedure body will have SPARK_Mode => Off.

Yes, SPARK provides an explicit escape hatch, but it is meaningfully different from Rust's `unsafe`.

In SPARK: you keep the spec in the verifiable world (a SPARK-visible subprogram declaration) and place the implementation outside the proof boundary (for example in a body or unit compiled with `pragma SPARK_Mode(Off)`). The body is visible in source but intentionally opaque to the prover; callers are still proved against the spec and must rely on that contract.

In Rust: `unsafe` is a lexical, language-level scope or function attribute that disables certain compiler/borrow checks for the code inside the block or function. The unchecked code remains inline and visible; the language/borrow-checker no longer enforces some invariants there, so responsibility shifts to the programmer at the lexical site.

Practical differences reviewers should understand:

- Granularity: `unsafe` is lexical (blocks/functions); SPARK's hatch is modular/procedural (spec vs body).

- Visibility: Rust's unchecked code is written inline and annotated with `unsafe`; SPARK's unchecked implementation is placed outside the prover but the spec remains visible and provable.

- Enforcement model: Rust's safety holes are enforced/annotated by the compiler's type/borrow rules (caller and callee responsibilities are explicit at the site). SPARK enforces sound proofs on the interface; the off-proof body must be shown (by review/tests/docs) to meet the proven contract.

- Best practice: keep off-proof bodies tiny, give strong pre/postconditions on the SPARK spec, minimize the exposed surface, and rigorously review and test the non-SPARK implementation.

TL;DR: `unsafe` = "this code block bypasses language checks"; SPARK's escape hatch = "this implementation is deliberately outside the prover, but the interface is still what we formally prove against."

I think I did say:

> although none is as integrated as SPARK I believe

And yes, they're experimental (for now). But some are also used in production. For example, AWS uses Kani for some of their code, and recently launched a program to formally verify the Rust standard library.

Whether the language was designed for it does not matter, as long as it works. And it works well.

> `unsafe` blocks create unverifiable boundaries

Few of the tools can verify unsafe code is free of UB, e.g. https://github.com/verus-lang/verus. Also, since unsafe code should be small and well-encapsulated, this is less of a problem.

> the trait system and lifetime inference are too complex to model completely

You don't need to prove anything about them: they're purely a type level thing. At the level these tools are operating, (almost) nothing remains from them.

> procedural macros generate code that can't be statically verified

The code that procedural macros generate is visible to these tools and they can verify it well.

> interior mutability (`Cell`, `RefCell`) bypasses type system guarantees

Although it's indeed harder, some of the tools do support interior mutability (with extra annotations I believe).

> Rust can panic in safe code

That is not a problem - in fact most of them prove precisely that: that code does not panic.

> Most critically, Rust lacks a formal language specification with mathematical semantics

That is a bit of a problem, but not much since you can follow what rustc does (and in fact it's easier for these tools, since they integrate with rustc).

> Rust's semantics are informally specified and constantly evolving (async, const generics, GATs)

Many of those advancements are completely erased at the levels these tools are operating. The rest does need to be handled, and the interface to rustc is unstable. But you can always pin your Rust version.

> This isn't tooling immaturity though, it's a consequence of language design.

No it's not, Rust is very well amenable to formal verification, despite, as you said, not being designed for it (due to the shared xor mutable rule, as I said), Perhaps even more amenable than Ada.

Also this whole comment seems unfair to Rust since, if I understand correctly, SPARK also does not support major parts of Ada (maybe there aren't unsupported features, but you not all features are fully supported). As I said I know nothing about Ada or SPARK, but if we compare the percentage of the language the tools are supporting, I won't be surprised if that of the Rust tools is bigger (despite Rust being a bigger language). These tools just support Rust really well.

> Whether the language was designed for it does not matter, as long as it works. And it works well.

It matters fundamentally. "Works well" for research projects or limited AWS components is not equivalent to DO-178C Level A certification where mathematical proof is required. The verification community distinguishes between "we verified some properties of some code" and "the language guarantees these properties for all conforming code."

> Few of the tools can verify unsafe code is free of UB

With heavy annotation burden, for specific patterns. SPARK has no unsafe - the entire language is verifiable. That's the difference between "can be verified with effort" and "is verified by construction."

> You don't need to prove anything about [traits/lifetimes]: they're purely a type level thing

Trait resolution determines which code executes (monomorphization). Lifetimes affect drop order and program semantics. These aren't erased - they're compiled into the code being verified. SPARK's type system is verifiable; Rust's requires the verifier to trust the compiler's type checker.

> The code that procedural macros generate is visible to these tools

The macro logic is unverified Rust code executing at compile time. A bug in the macro generates incorrect code that may pass verification. SPARK has no equivalent escape hatch.

> some of the tools do support interior mutability (with extra annotations I believe)

Exactly - manual annotation burden. SPARK's verification is automatic for all conforming code. The percentage of manual proof effort is a critical metric in formal verification.

> That is not a problem - in fact most of them prove precisely that: that code does not panic

So they're doing what SPARK does automatically - proving absence of runtime errors. But SPARK guarantees this for the language; Rust tools must verify it per codebase.

> you can follow what rustc does (and in fact it's easier for these tools, since they integrate with rustc)

"Follow the compiler's behavior" is not formal semantics. Formal verification requires mathematical definitions independent of implementation. This is why SPARK has an ISO standard with formal semantics, not "watch what GNAT does."

> Rust is very well amenable to formal verification [...] Perhaps even more amenable than Ada

Then why doesn't it have DO-178C, EN 50128, or ISO 26262 certification toolchains after a decade? SPARK achieved this because verification was the design goal. Rust's design goals were different - and valid - but they weren't formal verifiability.

> SPARK also does not support major parts of Ada

Correct - by design. SPARK excludes features incompatible with efficient verification (unrestricted pointers, exceptions in contracts, controlled types). This is intentional subsetting for verification. Claiming Rust tools "support more of Rust" ignores that some Rust features are fundamentally unverifiable without massive annotation burden.

The core issue: you're comparing research tools that can verify some properties of some Rust programs with significant manual effort, to a language designed so that conforming programs are automatically verifiable with mathematical guarantees. These are different categories of assurance.

> No it's not, Rust is very well amenable to formal verification, despite, as you said, not being designed for it (due to the shared xor mutable rule, as I said), Perhaps even more amenable than Ada.

I would like to add a few clarifications that I may not have mentioned in my other reply.

You are correct that Rust's ownership/borrow model simplifies many verification tasks: the borrow checker removes a great deal of aliasing complexity, and that has enabled substantial research and tool development (RustBelt, Prusti, Verus, Creusot, Aeneas, etc.). That point is valid.

However, it is misleading to claim Rust is plainly more amenable to formal verification than Ada. SPARK is a deliberately restricted subset of Ada designed from the ground up for static proof: it ships with an integrated, industrial-strength toolchain (GNATprove) and established workflows for proving AoRTE and other certification-relevant properties. Rust's type system gives excellent leverage for many proofs, but SPARK/Ada today provide a more mature, production-proven path for whole-program safety and certification. Which is preferable therefore depends on what you need to verify - research or selected components versus whole-program, auditable certification evidence.

SPARK/Ada is used in many mission-critical industries (avionics, rail, space, nuclear, medical devices) for a reason: the language subset, toolchain, and development practices are engineered to produce certifiable evidence and demonstrable assurance cases.

Rust brings superior language ergonomics and strong compile-time aliasing guarantees, but it faces structural barriers that make SPARK's level of formal verification fundamentally unreachable. These are not matters of tooling immaturity, but of language design and semantics:

- Rust allows pervasive unsafe code, which escapes the borrow checker's guarantees. Every unsafe block must be modelled and verified separately, defeating whole-program reasoning. SPARK forbids such unchecked escape hatches within the verified subset.

- Rust's semantics include undefined behavior and panics, which cannot be statically ruled out by the compiler. SPARK, by contrast, can prove statically that such run-time errors are impossible.

- Rust's rich features (lifetimes, traits, interior mutability, macros, async, etc.) greatly complicate formal semantics. SPARK deliberately restricts such constructs to preserve provable determinism.

- Rust lacks a single, formally specified, stable verification subset. SPARK's subset is precisely defined and stable, with a formal semantics that proofs can rely on across versions.

- Rust's verification ecosystem is fragmented and research-oriented (Prusti, Verus, Creusot, RustBelt), whereas SPARK's GNATprove toolchain is unified, production-proven, and already qualified for use in DO-178C, EN-50128, and IEC-61508 workflows.

- Certification for Rust toolchains (qualified compilers, MC/DC coverage, auditable artifacts) is only beginning to emerge; SPARK/Ada toolchains have delivered such evidence for decades.

In short, Rust's design - allowing unsafe code, undefined behavior, and a complex evolving feature set - makes SPARK-level whole-program, certifiable formal verification structurally impossible. SPARK is not merely a verifier bolted onto Ada: it is a rigorously defined, verifiable subset with an integrated proof toolchain and an industrial certification pedigree that Rust simply cannot replicate within its current design philosophy.

If your objective is immediately auditable, whole-program AoRTE proofs accepted by certifying authorities today, SPARK is the practical choice.

I hope this sheds some light on why SPARK's verification model remains unique and why Rust, by design, cannot fully replicate it.