Having Linux make the first move, then learn from their successes and mistakes to implement a better version in your own OS seems like a good move that profits everyone.
It's not like Microsoft isn't pushing Rust on their own platform. Every time a Windows driver does a use-after-free somebody swears at Microsoft for causing a bluescreen.
So why not formal verification or static analysis tools of an existing codebase (Frama-C)? Or dropping in C derived from a formally verified source (Idris / F-Star)? Or a language with good performance and a verified subset (Ada/Spark)?
I've never read anything to make me think that Rust is a safer alternative than any of these options; the guarantees of the language are just assurances of its advocates.
Its not like Microsoft does not work on that, they probably have a dozen different formally verified programming languages around at the moment. You already mentioned F*[0], but there is also Checked C[1], VCC[2], koka[3], dafny[4], Project Verona and probably some more that I am missing. The point being that they have clearly evaluated and have used F-Star for parts of the OS already.
As one of the developers of this patchset, here is my simple answer: try it. I'd love to see it.
I know Rust; I don't know Frama-C or Idris or Ada. What little I know of Ada indicates that it doesn't actually solve the problem because it doesn't have safe dynamic memory allocation. Ada advocates have told me I'm wrong and there's some new thing (inspired by Rust) that does this. Maybe! I'm not sure there's any benefit to the world in arguing it; someone skilled in Ada should just write something and show how it would actually work.
Write a kernel module with any of those techniques, send in a patch, and see what people say. Any effective means of making the kernel less buggy is great. It doesn't have to be Rust.
I claim that Rust is effective because, well, those patches are there, and frankly I haven't been doing any work for this project in almost a year and we're getting so many enthusiastic contributors. So, apparently, a lot of other people also know or are willing to learn Rust. Maybe it's actually better than the alternatives you mention; maybe it just has better documentation. I don't know. I'm not claiming Rust is perfect, just that it's empirically likely to solve the problem.
If you can get together a community of folks who can do any of these other approaches, that would be fantastic. Try it!
> If memory allocation fails, an exception is thrown and a retry with less memory requirements is a possible execution path.
So, since we're talking about the Linux kernel, that's a red flag straight away. Notice a whole section of this new LKML post is about their implementation of Rust's alloc crate? Linus doesn't want useless exceptions here and has previously objected on that basis.
Meanwhile... Over in the C++ world they have this same behaviour. For them these exception handlers are a fiction, in reality popular standard libraries (never mind everything else you're using) will blow up in low memory and it's game over. Maybe Ada does a lot better. Maybe.
As one of the developers of this patchset, can you comment on how these contributions will be licensed? Can you also comment on how a third party can verify or audit the output of a compiler without a specification?
>just that it's empirically likely to solve the problem.
Can you comment on what empirical evidence you based this statement on?
> Can you also comment on how a third party can verify or audit the output of a compiler without a specification?
I imagine it's pretty similar to how you verify or audit the output of a compiler with a specification. If you can show me an example of the sort of verification or audit you envision, I can perhaps give you a more helpful comment.
We do not make any claims about verified or audited compiler outputs or compliance with specifications, and to the best of my knowledge, the Linux kernel does not currently verify or audit any compiler outputs, so I'm not sure I follow why I would be particularly qualified to comment on it, but I could maybe point you at some interesting research.
> Can you comment on what empirical evidence you based this statement on?
Yes, I did comment on this, in the paragraph you quoted. If you'd like me to expand on something can you clarify?
>I imagine it's pretty similar to how you verify or audit the output of a compiler with a specification.
I must be missing the humor here. You can test against a specification to verify implementation. Testing an implementation against itself is meaningless. I would think that this is obvious. Also I don't typically use the word "empirical" to indicate purely qualitative assessments or intuition. You also provided a dead link.
Without being an expert in either language, given what Ada is used for, I'm sure it offers at least all of the safety Rust offers and probably more.
But Rust has a lot more going for it than safety. Maybe it shouldn't matter for kernel code, but the focus on developer "ergonomics," high-level functional constructs and pretty flexible pattern matching, supporting both ML-style and Algol-style syntax, very strong language-level tooling for building, testing, packaging, and generating documentation, plus the fact that the language itself is so well documented with great, accessible tutorials.
There's a reason it keeps winning Stack Overflow's most loved language award every year. It's easy to learn for people coming from almost any kind of programming background, supportive of newbs, and equally well-suited for application-level and system-level development, so it ends up with pretty widespread appeal.
So again, I don't know that any of this should matter for selecting what languages you're going to support in the kernel, but as a purely practical matter, it's a lot easier to ask contributors to use a language they already know and use elsewhere than it is to ask them to learn something new that they don't use anywhere else, which is what Ada and Idris and what not would be for most developers.
Rust has far more enthusiasm from potential contributors than any of those options. Onboarding something is meaningless if it's not going to be used and maintained.
FWIW I would trust anything language-level over applying static analysis tools to C (for an evolving codebase like Linux, having the safety be a first-class part of the codebase is a must) or extracting C from another language (the semantics of C are surprisingly weak, such an extraction can only ever be as good as your model of C which will probably be buggy).
> I've never read anything to make me think that Rust is a safer alternative than any of these options
What Rust focuses on is providing a scalable approach to a limited kind of formal verification - focusing purely on type- and memory-safety, not conformance to arbitrary specifications. The latter is of course being worked on, but type safety and memory safety are table stakes.
I can't think of a case of a system like that being revived after 10 or more years. My best guess is that the mismatch between the environment the language was written in and the modern world simply becomes too great for anyone to care after a while.
For instance, I have to deal with JSON. There's no reason older languages can't deal with JSON, but it'll often be inconvenient as the libraries try to deal with mismatches in the type system or whatever, and it just can't ever be quite as slick as a more modern language written with some partial, vague combination of the fact that JSON exists in mind, together with the sorts of techniques used to implement JSON support in modern languages. Or sometimes it'll just be the dominant library for dealing with it just has an old design that baked bad assumptions in, and since it's the dominant library it's the one everything uses, so I have to deal with that everywhere.
And it's just the death of a thousand of those cuts. "But you can do JSON in my favored language!" Yeah, but it's just not quite as slick. "But we have an HTTP client in this language!" Yeah, but it's just not quite as slick. "But we have GPU support in my favorite language!" Yeah, but it's just not quite as slick. And you've ground all that "not quite as slick" deeply into your language over the decades.
So nobody has any one reason they can point at as to why Ada isn't going come back, or why E isn't going to come back, or why Cyclone isn't going to come back... but the reality is that the sum total of the issues is more than sufficient to prevent it from ever happening, and what's going to happen is a more modern language is going to get the support.
Advocates also try to "defeat in detail", in analogy to the military technique, arguments to the contrary, but the problem is that while that may satisfy the advocate, everyone else is still left looking at a metaphorical battlefield that has an exhausting number of "battles" over basic deficiencies, and being "defeated in detail" by an advocate just transfers to exhaustion with the underlying language anyhow.
It probably isn't a "the issue" with Ada/SPARK. It's a horde of small little things, the sum total of which is too much hassle for anyone to want to deal with.
What kind of scalable approach are you describing? Rust is not formally verifiable. I'm aware that there are projects attempting to solve this problem, however as I understand it you can't currently create verifiable production code.
> focusing purely on type- and memory-safety, not conformance to arbitrary specifications
What are types if not arbitrary specifications? Are you suggesting that formally verified software is not memory safe? What on earth does "scalable" even mean? A verified program cannot scale? If not, why not?
Types are not 'arbitrary' in any real sense. Type systems can be described as a sort of formal verification that leverages and is closely aligned with the syntax structure of actual code, thus type systems can preserve compositionality but don't generally support non-local reasoning, at least not in a very elegant way. This is exactly what can make them potentially more scalable than other sorts of FV, hence more useful for programming "in the large".
Types that can be defined by users are, necessarily, part of a language syntax, so that they are aligned with the syntax is just a tautology. Equating user defined types with formal verification is just nonsense. C has types. Python has types. Rust has types. That the program does what you think it should do does not logically follow the mere existence of types. I have no idea what "in the large" is supposed to mean. I feel like GPT-3 is arguing with me.
I believe what zozbot234 is saying is that Rust focuses on a specific subset of formal verification, and on trying to make a language with that enforces that subset in a way that is still somewhat ergonomic.
I'm assuming "scalable" in this case is referring to the fact that fully verifying a piece of software is currently takes a large amount of effort. Rust's approach is more "scalable" in the sense that they decided to focus primarily on memory safety, and punted on other forms of verification in the hope that it would be more ergonomic/require less effort.
> I've never read anything to make me think that Rust is a safer alternative than any of these options; the guarantees of the language are just assurances of its advocates.
Note that some formal verification work has been done on the guarantees that the borrow checked provides (which led to finding and fixing an edge case bug), so there is a little more concrete evidence that just the "assurances of its advocates"
I don't think that the mere existence of a "working group" or grant funded research positions for verifying rustc or a subset thereof is the same as having a verified compiler. That's certainly not a strong argument to against using rustc for safety over one of the many verified compilers that actually exist. At least, I didn't see in the "rustbelt" list of papers anything which appeared to indicate that a verified compiler or language subset do in fact exist. Furthermore, that the proposed patches rely on the nightly compiler and a special variant of rust's core, which can only be enforced by special feature tags seems to suggest that Rust-Linux is not being written in this hypothetical, future compiler.
> At least, I didn't see in the "rustbelt" list of papers anything which appeared to indicate that a verified compiler or language subset do in fact exist.
You might want to look harder, because there is certainly a verified language subset (anything that can be written as LambdaRust, which covers a large and substantial swathe of the language).
If you're looking for a formally verified, optimizing compiler for any realistic language, particularly a low level language, you have pretty much one option: CompCert (I'm not sure where this large group of verified compilers you claim exists lives, but I'd love to see them). As someone who would love a verified Rust toolchain, I can't help but notice that very few safety-critical systems actually use CompCert in practice, which makes me a bit suspicious of claims that this is something Rust needs to be competitive in this arena.
Finally, I'll point out that verified compilers like CompCert only promise to translate correct C to correct machine code; they do not necessarily aid you in coming up with correct C in the first place. Miscompilations certainly happen, but not nearly as often as undefined behavior in the source program. So asking why people would use Rust when verified compilers exist for other languages is sort of missing the point.
I think Compcert isn’t widely used because of its license. GNAT is also GPL. Everyone lost their mind over Oracle charging a nominal sum for their compiler. I don’t think rust would have its corporate backing if it had similar licensing.
Also, there is only undefined behavior in rust. The compilers behavior can change at any time for any reason any time somebody with commit rights thinks that it should. You cannot begin to assert provenance about some thing that can change at any time.
In a number of ways I'd argue that FV of an average CRUD app is a very hard problem, for two reasons:
- the requirements generally change quite frequently
- the requirements are generally a long way from being "formally verifiable"
Where I, personally, have found the most value so far using FV techniques is from modelling "tricky" things whose requirements are quite solid. A few examples from my work:
- A state machine for managing LoRa communication. The requirements in the LoRa spec are pretty good. I took those requirements and designed a state machine I could implement on a low-power device and modelled that SM in TLA+ to verify that there were no deadlocks/attempted transitions to invalid states.
- A state machine for managing the Bluetooth communication between an Android app, a Bluetooth-Iridium (satellite) bridge, and the server on the other side of the Iridium link. Similar to the LoRa case, I used TLA+ to ensure that there were no deadlocks in my proposed SM (and uhh made a number of corrections as I went). Ultimately, due to the design of the BT/Iridium gateway, the exercise resulted in the conclusion that it wasn't actually possible for this process to be 100% reliable but it was possible to detect and retry the edge cases.
- Modelling a somewhat awkward email/SMS invite flow for a mobile app. You could get an invitation over email or SMS, and your account transitioned through a few different states as both your email address and your phone number were verified. Modelling this helped ensure that your account couldn't get into a state where no forward progress was possible or into a state where you could use a half-completed account.
Just a note, I (personally) don’t consider TLA+ a FV technique — it is a really good technique that actually scales much better than formal verification programs like Agda or Coq.
It's formal verification of an abstract machine as opposed to running code. The problem is that formal methods as a field doesn't yet have a rich enough vocabulary to distinguish between its flavors. I use "code verification" and "design verification", which is a little more specific, but has its own issues.
Let us see what they have, using formal verification (Ada/SPARK).
- JWX is a library for handling JSON data and more. It is implemented in the SPARK programming language and has been proven to contain no runtime errors. As a result, JWX is particularly suited for processing untrusted information.
- The RecordFlux specification language is a domain-specific language to formally specify message formats of existing real-world binary protocols. Its syntax is inspired by Ada. A detailed description of the language elements can be found in the Language Reference.
- Formally verified, bounded-stack XML library
This is just Componolit.
Let us check the F* language:
- F* (pronounced F star) is a general-purpose functional programming language with effects aimed at program verification. It puts together the automation of an SMT-backed deductive verification tool with the expressive power of a proof assistant based on dependent types. After verification, F* programs can be extracted to efficient OCaml, F#, C, WASM, or ASM code. This enables verifying the functional correctness and security of realistic applications. The main ongoing use case of F* is building a verified, drop-in replacement for the whole HTTPS stack in Project Everest. This includes verified implementations of TLS 1.2 and 1.3 and of the underlying cryptographic primitives.
---
Given the above, what exactly do you mean that it does not scale?
Whoops, I completely forgot about libsparkcrypto (https://github.com/Componolit/libsparkcrypto) which is a "formally verified implementation of several widely used cryptographic algorithms using the SPARK 2014 programming language and toolset. For the complete library proofs of the absence of run-time errors like type range violations, division by zero and numerical overflows are available. Some of its subprograms include proofs of partial correctness.".
Looking at the examples you've provided, I'm pretty sure the Linux kernel's codebase dwarfs their combined codebases into insignificance. Since it is very hard to not have NP-hard asymptotic complexity when formally verifying a codebase as large and as demanding (in terms of performance, etc) as the Linux kernel, I would expect that if you tried to rewrite Linux in the languages you've mentioned and formally verify the rewrite, you will be able to keep the entire AWS cloud (globally) very busy for a very, very long time. On the other hand, if you were to do the rewrite in Rust and use some hypothetical formally verified Rust compiler to type check it (which is mathematically equivalent to such a formal verification in, say, F* limited to only verifying certain properties), it'll probably type check in minutes or even seconds on the laptop I'm currently writing this reply on.
For a less hypothetical example. RedoxOS has a codebase that dwarfs Sel4 (but it is still much smaller than Linux in turn), yet it type checks with much less compute cost than it takes to formally verify Sel4. Or heck, Redleaf actually formally verifies in a tiny fraction of the time it takes to formally verify Sel4, because the Rust type checker not only significantly reduces the amount of requirements that still needs to be formally verified but also reduces the complexity to verify those requirements to linear to the codebase size due to the restrictions that the Rust type system and module system puts on valid Rust programs (with a little bit of care when writing the code). Of course, until you have a certified Rust compiler there might still be miscompiles of Redleaf and until all the features of Rust used in the Linux kernel is formally specified, we cannot do the same with the Linux kernel as for Redleaf even if somehow all of Linux got rewritten in Rust. But in Rust at least it is conceivably possible given sufficient resources and sufficient time. I very much doubt it will ever be possible without help from a sufficiently expressive and practical type system.
Oh yea. Then there's another problem. Most (all?) of the examples you've mentioned have requirements that are well specified and pretty much set in stone. The requirements fro the Linux kernel... aren't quite as obliging.
It's security parameters are well-defined enough that you can use it to launch isolated VMs with... less secure OSes that are sandboxed from each other.
That's true, but as a microkernel, seL4 has a very limited scope. It's not in the same ballpark as, say, the Linux kernel. It's more like a hypervisor. [0]
From a quick google, seL4 has around 10,000 lines of C code, whereas the Linux kernel has around 30,000,000.
Also to note, ARM toyed with the idea of using a Linux distribution as alternative to MBed, but it was short love, it is already dead, it only lasted one release.
It's not like Microsoft isn't pushing Rust on their own platform. Every time a Windows driver does a use-after-free somebody swears at Microsoft for causing a bluescreen.