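for context: in rust the "translation unit" is a crate (in C++, each source file is its own translation unit). So, to a first approximation, you only get parallelism across crates when compiling in rust (rustc does split a crate's codegen into parallel codegen units, but the front end is mostly serial per crate). When you have a single-crate project, this means that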
1. you get parallelism across all your dependencies, but
2. your (final) crate is serial.
Splitting your crate can therefore get you parallelism back in step 2, which can be a (compilation) perf win. You can also get a caching benefit, since sub-crates that haven't changed don't need to be rebuilt, but even absent that you can get wins.
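To make the split concrete, here's a rough sketch of a cargo workspace. The crate names (`core_logic`, `io_layer`, `app`) and the `crates/` layout are made up for illustration; the point is only that two crates that don't depend on each other can be compiled at the same time, and the final crate shrinks.

```toml
# Cargo.toml at the workspace root (hypothetical layout)
[workspace]
resolver = "2"
members = [
    "crates/core_logic",  # hypothetical: does not depend on io_layer
    "crates/io_layer",    # hypothetical: does not depend on core_logic
    "crates/app",         # hypothetical: the final binary, depends on both
]
```

```toml
# crates/app/Cargo.toml (hypothetical final crate)
[package]
name = "app"
version = "0.1.0"
edition = "2021"

[dependencies]
# Path dependencies on the sibling crates. Since neither depends on the
# other, cargo is free to build them in parallel before building `app`.
core_logic = { path = "../core_logic" }
io_layer = { path = "../io_layer" }
```

How much this helps depends on how cleanly your code separates. If everything ends up depending on one big crate anyway, you're back to a serial chain.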
It can come at the cost of runtime performance though, as rust won't (by default) do things like inline across translation units iirc. You can get some of this perf back via various tricks (for example link-time optimization).
Looks like the rust performance book has some writing on these tradeoffs. I haven't read these articles, but the table of contents on the left seems reasonable enough.
https://nnethercote.github.io/perf-book/build-configuration....
Again I haven't read that (so I'm assuming it says the following), but you can get "best of both worlds" by configuring debug builds to not do LTO (= faster compilation speed, slower runtime), and release builds to do LTO (= faster runtime, slower compilation speed). There are also variants of LTO that make various tradeoffs you could look into.
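As a sketch of what that could look like in `Cargo.toml` (the `lto` profile key and its values are standard cargo settings; which value is right for your project is something you'd have to measure):

```toml
# Cargo.toml (at the workspace root if you use a workspace)

[profile.dev]
# "off" disables LTO entirely: fastest to build, slowest at runtime.
# (cargo's default here is `false`, which still does a cheap "thin local"
# LTO within each crate but nothing across crates.)
lto = "off"

[profile.release]
# "thin" does cross-crate LTO at a moderate build-time cost.
# "fat" (or `true`) optimizes harder across crates but builds slower.
lto = "thin"
```

Note that cargo's default for the release profile is also `lto = false`, so cross-crate LTO in release builds is something you have to opt into.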
That article also links to `cargo-wizard`, a cargo subcommand:
https://github.com/Kobzol/cargo-wizard
It won't auto-split crates for you (of course), but it does seem to give you some default configurations of `cargo`, one of which configures for faster compilation speed. Could be an easy way to mess around with things.
Thanks, that's helpful to keep in mind. Looks like you can fiddle with how the project is organized to strike a reasonable balance between 1 and 2. Right now, it's definitely the second step that's killing me: assembling the crates together takes forever.
you can try doing `cargo build --timings ...`. It will generate a report of how long each crate takes to compile etc. It sounds like you know that your final crate is the culprit, but this would let you confirm it.
If you suspect the issue is assembling all the files together (e.g. linking) you can see some advice on optimizing link speeds here
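One common suggestion for link speed is switching to a faster linker. A minimal sketch, assuming you're on x86_64 Linux, have `lld` installed, and your C compiler driver accepts `-fuse-ld=lld` (all assumptions on my part, adjust the target triple and linker for your setup):

```toml
# .cargo/config.toml in the project root

# Assumed target; change the triple if you build for something else.
[target.x86_64-unknown-linux-gnu]
# Ask the compiler driver that rustc uses for linking to use lld
# instead of the system default linker.
rustflags = ["-C", "link-arg=-fuse-ld=lld"]
```

Worth checking the `--timings` report first, since this only helps if linking is actually where the time goes.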
Note that there could be other causes of slow compilation of the final binary. For example, if you heavily use macros (especially procedural ones), they're known to make compilation quite slow.