Sure, but the small compiler that you write in C can't compile rustc. So you write a new Rust compiler that uses much simpler Rust that the small compiler in C can compile. Then that new Rust compiler can compile rustc.
And since that new Rust compiler might not have much of an optimizer (if it even has one at all), then you recompile rustc with your just-compiled rustc.
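To make the chain above concrete, here is a toy model of the stages. It is only a sketch: the `Lang`, `Source`, `Binary` and `build` names, and the names given to each stage, are made up for illustration and do not correspond to any real tool. The one rule it encodes is that a binary is only as well optimized as the compiler that built it, which is why rustc gets compiled twice.

```rust
// Toy model of a staged bootstrap. All names here are hypothetical.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Lang { C, RustSubset, FullRust }

// Source code of a compiler: what it is written in, what it accepts,
// and whether its code generator performs optimizations.
struct Source { name: &'static str, written_in: Lang, accepts: Lang, optimizes: bool }

// A compiler binary produced by building some Source.
#[derive(Debug, Clone, PartialEq)]
struct Binary { name: &'static str, accepts: Lang, optimizes: bool, optimized_build: bool }

// A full Rust compiler can also compile code written in the subset.
fn can_compile(accepts: Lang, written_in: Lang) -> bool {
    accepts == written_in || (accepts == Lang::FullRust && written_in == Lang::RustSubset)
}

fn build(compiler: &Binary, src: &Source) -> Result<Binary, String> {
    if !can_compile(compiler.accepts, src.written_in) {
        return Err(format!("{} cannot compile {}", compiler.name, src.name));
    }
    Ok(Binary {
        name: src.name,
        accepts: src.accepts,
        optimizes: src.optimizes,
        // The output is only optimized if the compiler doing the build optimizes.
        optimized_build: compiler.optimizes,
    })
}

fn bootstrap() -> Result<Vec<Binary>, String> {
    // Assumed seed: an existing, trusted C compiler.
    let cc = Binary { name: "cc", accepts: Lang::C, optimizes: true, optimized_build: true };
    let mini = Source { name: "mini compiler (in C)", written_in: Lang::C, accepts: Lang::RustSubset, optimizes: false };
    let simple = Source { name: "simple Rust compiler", written_in: Lang::RustSubset, accepts: Lang::FullRust, optimizes: false };
    let rustc = Source { name: "rustc", written_in: Lang::FullRust, accepts: Lang::FullRust, optimizes: true };

    let stage0 = build(&cc, &mini)?;
    let stage1 = build(&stage0, &simple)?;
    let stage2 = build(&stage1, &rustc)?; // works, but built without an optimizer
    let stage3 = build(&stage2, &rustc)?; // rustc rebuilt by the just-compiled rustc
    Ok(vec![stage0, stage1, stage2, stage3])
}

fn main() {
    for (i, stage) in bootstrap().expect("bootstrap failed").iter().enumerate() {
        println!("stage {i}: {} (optimized build: {})", stage.name, stage.optimized_build);
    }
}
```

In this model, trying to skip the middle step fails, because the compiler written in C only accepts the Rust subset and rustc is written in full Rust.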
Sometimes people like to do things just to do them, because the idea is cool. Sometimes an idea is cool because it has real world ramifications just for existing (trusting trust, supply chain attacks), though many people like to argue about whether or not this particular idea is Just Cool TM or Actually Useful TM.
I don’t think the article made any false pretenses either way- it seemed evident to me that at least some of the motivation was “because how? I want to do the how!” And that’s cool!
Also, I think the article pretty clearly redirects your question. Duh he could just cross compile rust. But the whole article, literally the entirety of it, is dedicated to exploring the chicken/egg problem of “if rustc is written in rust, what compiles rustc?”, and the rabbithole of similar questions that arise for other languages. The answer to that question being both “rustc of course”, and also “another compiler written in some other language.” The author wants to explore the process of making the compiler in [the/an] ‘other language’ because it’s Cool.
Just because something has already been done, does not mean there’s no worth in doing it- nor that there is only worth in doing it for educational and/or experiential purposes :)
Cross-compilation is orthogonal to bootstrapping, which is the major motivating factor for having something like what is described earlier in this thread: a small compiler written in C which can compile a subset of Rust, used to compile a larger, feature complete Rust compiler in that Rust subset -- versus what we have right now, which is a Rust compiler written in Rust which requires an existing Rust compiler binary, which means we have an unmitigated supply chain attack vector.
If you change your question to "why does anyone care about bootstrapping?", the answer would revolve around that aforementioned supply chain attack vector.
Perhaps you're comfortable with the lack of assurances that non-bootstrappable builds entail (everyone has a different appetite for risk); some others aren't though, and so they have an interest in efforts to mitigate the security risks inherent in trusting a supply chain of opaque binaries.
Yes, they are, because they want to target systems which explicitly disallow cross-compilation, like Debian.
Yes, I think it's silly too but other people disagree and they are free to work on whatever they want.
Do I think it's a mostly pointless waste of time? Obviously, I do. Still, I guess there are worse ones.
Note that the Rust project does use cross-compilation for the ports it supports itself and considering the amount of time they use features only available in the current version of rustc in rustc, I guess it's safe to assume they share my opinion on the usefulness of keeping Rust bootstrappable.
* Bootstrapping a language on a new architecture using an existing architecture. With modern compilers this is usually done using cross compilation.
* Bootstrapping a language on an architecture _without_ using another architecture.
The latter is mostly done for theoretical purposes, like reproducibility or addressing the concerns raised in "Reflections on Trusting Trust".
Paring down a rust compiler in rust to only use a subset of rust features might not be a big lift. Then you only need to build a rust compiler (in C) that implements the features used by the pared-down rust compiler rather than the full language.
Pypy, for instance, implements RPython, which is a valid subset of Python. The compiler is written in RPython. The compiler code is limited on features, but it only needs to implement what RPython includes.
But that doesn't conform to the "Descent Principle" described in the article.
I haven't really been following Zig, but I still felt slightly disappointed when I learnt that they were just replacing a source-based bootstrapping compiler with a binary blob that someone generated and added to the source tree.
The thing that makes me uncomfortable with that approach is that if a certain kind of bug (or virus! [0]) is found in the compiler, it's possible that you have to fix the bug in multiple versions to rebootstrap, in case the bug (or virus!) manages to persist itself into the compilation output. The Dozer article talks about the eventual goal of removing all generated files from the rustc source tree, i.e. undoing what Zig recently decided to do.
If everything is reliably built from source, you can just fix any bugs by editing the current source files.
I think there is too much mysticism here in believing that the bootstrapping phases will offer any particular guarantees. Without essentially a formal proof that the output of the compiler is what you expect, you will have to manually inspect every aspect of every output phase of any bootstrapping process.
OK, so you decide to use Compcert C. You now have a proof that your object code is what your C code asked for. Do you have a formal proof of your C code? Have you proved that you have not allowed any surprises? If not, what is your Rust compiler? Junk piled on top of junk, from this standpoint.
On the other hand, you could have a verified WASM (or other VM) runner. That verified runner could run the output of a formally verified compiler (which Rustc is not). The trusted base is actually quite small if you had a fully specified language with a verified compiler. But you have to start with that trusted base, and something like a compiler written in C is not really enough to get you there.
For compilers specifically, I think plenty of people would disagree.
It's not that it's exceedingly hard in C, but programming languages have evolved since the last millennium, and there are indeed language features that make writing compilers easier than it used to be.
I have the most fun when I write x86 MASM assembly. It's a pretty simple language all in all, even with the macro system. Much simpler than C.
But a simple language doesn't always make it simple to write complex programs like compilers.
It is really remarkably sucky to process trees without algebraic datatypes and full pattern matching. Most of your options for that are ML progeny, and the rest are mostly Lisps with a pattern-matching macro. While it’s definitely possible to implement, say, unification in C, I wouldn’t want to—and I happen to actually like C.
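For a sense of what algebraic datatypes and pattern matching buy you, here is a small sketch in Rust. The `Expr` type and `simplify` function are made up for this example and are not from any real compiler. Each rewrite rule is one `match` arm over a pair of subtrees, which in C would typically become a tagged union plus nested `switch` statements and manual tag checks.

```rust
// A tiny expression tree as an algebraic datatype (illustrative only).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Num(i64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

// Fold constants and apply a few identities (x + 0, x * 1, x * 0).
fn simplify(e: Expr) -> Expr {
    match e {
        Expr::Num(n) => Expr::Num(n),
        Expr::Var(v) => Expr::Var(v),
        Expr::Add(l, r) => match (simplify(*l), simplify(*r)) {
            (Expr::Num(a), Expr::Num(b)) => Expr::Num(a + b),
            (Expr::Num(0), x) | (x, Expr::Num(0)) => x,
            (l, r) => Expr::Add(Box::new(l), Box::new(r)),
        },
        Expr::Mul(l, r) => match (simplify(*l), simplify(*r)) {
            (Expr::Num(a), Expr::Num(b)) => Expr::Num(a * b),
            (Expr::Num(0), _) | (_, Expr::Num(0)) => Expr::Num(0),
            (Expr::Num(1), x) | (x, Expr::Num(1)) => x,
            (l, r) => Expr::Mul(Box::new(l), Box::new(r)),
        },
    }
}

fn main() {
    // (2 * 3) + (x * 1)
    let e = Expr::Add(
        Box::new(Expr::Mul(Box::new(Expr::Num(2)), Box::new(Expr::Num(3)))),
        Box::new(Expr::Mul(Box::new(Expr::Var("x".into())), Box::new(Expr::Num(1)))),
    );
    println!("{:?}", simplify(e));
}
```

The compiler also checks that the `match` is exhaustive, so adding a new `Expr` variant produces an error at every place that forgot to handle it.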
Given the task is to bootstrap Rust, a Rust subset is a reasonable and pragmatic choice if not literally the only one (Mes, a Lisp, could also work and is already part of the bootstrappable ecosystem).
Rust feels impossible to use until you "get" it. It eventually changes from fighting the borrow checker to disbelief at how you used to write programs without the assurances it gives.
And once you get past fighting the borrow checker it's a very productive language, with the standard containers and iterators you can get a lot done with high level code that looks more like Python than C.
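As a rough illustration of that point, here is a word-frequency count using only the standard library. The `top_words` function is a made-up example, not taken from any project.

```rust
use std::collections::HashMap;

// Return the `n` most frequent words, ties broken alphabetically.
fn top_words(text: &str, n: usize) -> Vec<(String, usize)> {
    let mut counts: HashMap<String, usize> = HashMap::new();
    for word in text.split_whitespace().map(|w| w.to_lowercase()) {
        *counts.entry(word).or_insert(0) += 1;
    }
    let mut ranked: Vec<_> = counts.into_iter().collect();
    // Highest count first, then alphabetical order for equal counts.
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    ranked.truncate(n);
    ranked
}

fn main() {
    for (word, count) in top_words("the cat and the hat and the bat", 2) {
        println!("{word}: {count}");
    }
}
```

There is no manual memory management, hash table implementation, or string buffer sizing in sight, which is most of what the equivalent C would consist of.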
I agree, but it's no different than C with a decent library of data structures. And even when you become more borrow checker aware and able to anticipate most of the issues, there are still cases where the solution is either non-obvious or requires doing things in indirect ways compared to C or C++.
The quality difference between generics and proc macros vs the hoops C jumps through instead is pretty significant. The way you solve this in C is also unobvious, but doesn't seem like it when you have a lot of C experience.
I've been programming in C for 20 years, and didn't realize how much of using it productively wasn't a skilful craft, but busywork that doesn't need to exist.
This may sound harsh, but sensitivity to order definition, and the fragility of headers combined with a global namespace is just a waste of time. These aren't problems worth caring about.
Every function having its own idea of error handling is also nuts. Having to be diligent about error checking and cleanup is not a point of pride, but a compiler deficiency.
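A minimal sketch of what that looks like in practice, assuming a made-up `ConfigError` type and `parse_port` function: errors travel through one `Result` type and the `?` operator, and cleanup is attached to a value's `Drop` implementation, so it runs on every exit path without a `goto cleanup` in sight.

```rust
use std::cell::Cell;
use std::num::ParseIntError;

// One error type for the whole operation (hypothetical, for illustration).
#[derive(Debug, PartialEq)]
enum ConfigError {
    Missing(&'static str),
    BadNumber(ParseIntError),
}

// Lets `?` convert a ParseIntError into a ConfigError automatically.
impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::BadNumber(e)
    }
}

fn parse_port(input: Option<&str>) -> Result<u16, ConfigError> {
    let raw = input.ok_or(ConfigError::Missing("port"))?;
    let port = raw.trim().parse::<u16>()?; // early return on failure
    Ok(port)
}

// Stand-in for a resource that must be released (file, lock, socket).
struct Guard<'a> {
    released: &'a Cell<bool>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.released.set(true);
    }
}

fn parse_with_cleanup(input: Option<&str>, released: &Cell<bool>) -> Result<u16, ConfigError> {
    let _guard = Guard { released };
    let port = parse_port(input)?; // the guard is dropped even if this returns early
    Ok(port)
}

fn main() {
    let released = Cell::new(false);
    println!("{:?}", parse_with_cleanup(Some("not a port"), &released));
    println!("cleanup ran: {}", released.get());
}
```

Ignoring the returned `Result` entirely triggers a compiler warning, since `Result` is marked `#[must_use]`.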
Maintenance of build scripts is not only an unnecessary effort, but it makes everything downstream of them worse. I can literally not have build scripts at all, and be able to work on projects bigger than ever. I can open a large project, with an outrageous number of dependencies, and have it build on the first try, integrate with IDEs, generate API docs, run unit tests out of the box. Usually works on Windows too, because the POSIX vs Windows schism can be fixed with a good standard library and cross-platform dependency management.
Multi-threading can be the default standard for every function (automatically verified through the entire call graph including 3rd party code), and not an adventurous novelty.
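Here is a small sketch of what that checking looks like, using scoped threads from the standard library. The `parallel_sum` function is a made-up example. The compiler accepts it because a shared `&[u64]` is safe to send to other threads; handing the workers something that is not thread-safe, such as an `Rc`, is rejected at compile time rather than discovered as a data race later.

```rust
use std::thread;

// Sum a slice by splitting it into chunks, one scoped thread per chunk.
fn parallel_sum(data: &[u64], workers: usize) -> u64 {
    let workers = workers.max(1);
    // Chunk size rounded up; at least 1, because `chunks(0)` panics.
    let chunk = ((data.len() + workers - 1) / workers).max(1);

    thread::scope(|s| {
        // Spawn all workers first so they run concurrently.
        let handles: Vec<_> = data
            .chunks(chunk)
            .map(|part| s.spawn(move || part.iter().sum::<u64>()))
            .collect();

        // Then collect the partial sums.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .sum::<u64>()
    })
}

fn main() {
    let data: Vec<u64> = (1..=100).collect();
    println!("{}", parallel_sum(&data, 4));
}
```

The threads borrow `data` directly, with no copying and no reference counting, because the scope guarantees they all finish before `parallel_sum` returns.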
Writing non-trivial programs is easier in Rust than in C, for people that are equally proficient in C as in Rust. Especially if you're allowed to use Cargo and the Rust crates ecosystem.
C isn't even in the same league as Rust when it comes to productivity – again, if you're equally proficient in Rust as in C.
I have 40 years of C muscle memory and it took me many tries and a real investment to get into Rust, but I don’t do any C anymore (even for maintenance- I’d rather rewrite it in Rust first).
Rust isn’t in a different class from C, it’s a different universe!
You have to consider that those who write the Rust compiler are experts in Rust, but not necessarily experts in C. So even if writing programs in C may be simpler than writing programs in Rust for some developers, the opposite is more likely in this case, even before we compare the merits of the respective languages.
This is 100% the case. All of the honest-to-god Rust experts I know work on the compiler in some way. Same goes for Lean, which bootstraps from C as well.
Writing programs that compile is much easier in C. It lets me accidentally do all sorts of ill-advised things that the Rust compiler will correctly yell at me about.
I don't remember it being any easier to write C that passes through a static analyzer like Coverity etc. than it is to write Rust. Think of rustc like a built-in static analyzer that won't let you ignore it. Sometimes that means it's harder to sneak bad ideas past the compiler.
You can now have trustworthy Rust compiler binaries, through the work of the Bootstrappable Builds community, which found a way to build a C compiler without having C compiler binaries yet.
You often write two compilers when trying to bootstrap a C compiler, as GCC used to do. Often, it's a very simple version of the language implemented in the architecture's assembly.