It makes me wonder though if illumos is worth it for a relatively small company to maintain. This bug came out of the larger ecosystem not knowing what to do for a niche OS.
What we're doing inherently requires deep integration up and down the stack. We'd still have to be doing OS-level work even if we used another operating system. But then we'd be at the mercy of upstream accepting patches, or we'd be keeping our own fork, and at that point we'd basically be in the same spot we are now, but with less overall control.
> ...But then we'd be at the mercy of upstream accepting patches,
This point bothers me, but I can't say with confidence that it's completely wrong. I know there are occasional rifts within the open source world, but I wish I knew two things:
1) How much overhead (in totality) is there when contributing to a project you don't control?
2) How different is the end result of collaboration between distinct groups or individuals versus doing things separately?
It depends on the project, and we do contribute upstream to other things all the time.
My comment wasn't so much about the overhead of collaboration as about the chance of significant differences in opinion, leading to a place where we'd basically have to fork anyway. Remember, in this specific context, we're talking about an operating system and hypervisor that are core to our product, and we're building our own hardware.
You can't get one single answer for these questions for the entirety of the open source community. Even cross-language norms can be different. These things are inherently tradeoffs.
I wasn't part of making the decision to use illumos, but having an extensive history of open source contributions, I'm confident it is the right one (at least on this axis).
On top of what Steve said, illumos does support all of the required APIs here, but the Rust libc crate was just missing definitions for them. It's not a tremendously exotic platform the way something like Haiku is.
Edit: also worth pointing out (again) that the bug actually exists everywhere -- it was just being masked on the other platforms.
Last year I got sucked into poking at some of the Rust cross-building issues (specifically issues targeting Solaris and BSD, and issues cross-hosting on macOS). Illumos isn't terribly exotic, but the Rust bootstrapping process has a few rough edges that will cut you. Illumos suffers mainly because it's not popular enough to get a ton of attention from the rustc folks and because it's not quite Solaris.
That said, for CI, cross-building is much easier to scale than tracking down every permutation. For something like illumos, that's not too bad. But for Solaris/SPARC? Heh.
> Since cranelift-codegen was an optional component that can be disabled, the x.py tooling could notice if a build failed in such a component
I think there's quite a bit of utility in ensuring that as much of the Rust core as possible builds. As cranelift-codegen is optional, the scripts should be able to bundle up a distribution with everything that succeeded.
Edit: Just took a quick look, and it sure looks like cranelift isn't built by default (at least that's what config.example.toml says, and none of the defaults available via 'x setup' seem to override that).
Weird. I'm looking at master right now and `config.example.toml` has this comment:
# This is an array of the codegen backends that will be compiled for the rustc
# that's being compiled. The default is to only build the LLVM codegen backend,
# and currently the only standard options supported are `"llvm"`, `"cranelift"`
# and `"gcc"`. The first backend in this list will be used as default by rustc
# when no explicit backend is specified.
#codegen-backends = ["llvm"]
Now I've not messed with the build profiles at all, and I don't have the repo checked out so digging through it is tedious, but my assumption is the library profiles work by copying everything from src/bootstrap/defaults/config.library.toml into a config.toml at the current directory. There's nothing overriding the default codegen-backends value that I can see.
The defaults for the Config struct are set in src/bootstrap/src/core/config/config.rs and codegen-backends is indeed just "llvm" (line 1167).
Nothing in src/bootstrap/src/core/build_steps/compile.rs appears to override that list.
So that's all very curious (to me). Did the gcc backend get built as well?
Tangentially: a year on and the GitHub interface is still nasty to use, and it was one of the big motivating factors for me backing off of hacking on the cross-build stuff. Every day seems to bring a new WTF moment. If I could get one thing for my birthday it would be for Rust to wean itself off of GitHub.
RFD 26 talked about the context around this choice: https://rfd.shared.oxide.computer/rfd/0026
There's a lot more to it than just this one thing I mentioned :)