Hacker News new | ask | show | jobs
by HippoBaro 424 days ago
Eminently pragmatic solution — I like it. In Rust, a crate is a compilation unit, and the compiler has limited parallelism opportunities, especially since rustc offloads much of the work to LLVM, which is largely single-threaded.

It’s not surprising they didn’t see a linear speedup from splitting into so many crates. The compiler now produces a large number of intermediate object files that must be read back and linked into the final binary. On top of that, rustc caches a significant amount of semantic information — lifetimes, trait resolutions, type inference — much of which now has to be recomputed for each crate, including dependencies. That introduces a lot of redundant work.

I also would expect this to hurt runtime performance as it likely reduces inlining opportunities (unless LTO is really good now?)

1 comments

They mention that compiling one crate at a time (-j1) doesnt give the 7x slowdown, which rules out the object file/caching-in-rustc theories... I think the only explanation is the rustcs are sharing limited L3 cache.
The L3 cache angle is one of our hypotheses too. But it doesn't seem like we can do much about it.