by vlovich123 977 days ago
I really dislike the need to artificially decompose crates in this way. If Rust could accurately track dependencies at a more fine-grained level, then no one would do this, because it adds complexity. It’s like hand-optimizing assembly, but for build systems.

What I mean by this is that if Rust tracked dependency information at the level of “my function depends on type declarations XYZ from crate A” vs “my function depends on the implementation of a function in crate A”, then the Rust compiler could apply this kind of split for you automatically. It wouldn’t eliminate all dependency massaging, but it would eliminate the purely mechanical tasks and give better results (e.g. rearranging the internal fields of a type in a crate shouldn’t dirty that dependency chain).
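To make the declaration vs implementation distinction concrete, here is a minimal toy sketch. It is not how rustc tracks dependencies; `Aspect`, `Fingerprint`, `Use`, and `dirty_dependents` are hypothetical names invented for illustration, and the fingerprints are plain numbers standing in for real hashes.

```rust
use std::collections::HashMap;

/// Which part of an upstream item a downstream function relies on (hypothetical model).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Aspect {
    /// Only the public declaration: a type's public shape or a function signature.
    Declaration,
    /// The body as well, e.g. because it was inlined into the dependent.
    Implementation,
}

/// Toy fingerprints for one upstream item; a real compiler would hash far more.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Fingerprint {
    declaration: u64,
    implementation: u64,
}

/// One recorded dependency edge: `dependent` uses `item` in the given way.
#[derive(Clone, Copy, Debug)]
struct Use {
    dependent: &'static str,
    item: &'static str,
    aspect: Aspect,
}

/// Returns the dependents that must be rebuilt after the upstream crate changed.
fn dirty_dependents(
    old: &HashMap<&'static str, Fingerprint>,
    new: &HashMap<&'static str, Fingerprint>,
    uses: &[Use],
) -> Vec<&'static str> {
    let mut dirty = Vec::new();
    for u in uses {
        let changed = match (old.get(u.item), new.get(u.item)) {
            (Some(a), Some(b)) => match u.aspect {
                // Declaration-only users ignore changes to the body.
                Aspect::Declaration => a.declaration != b.declaration,
                // Implementation users are dirtied by any change.
                Aspect::Implementation => a != b,
            },
            // Item was added or removed: conservatively rebuild.
            _ => true,
        };
        if changed && !dirty.contains(&u.dependent) {
            dirty.push(u.dependent);
        }
    }
    dirty
}
```

In this model, changing only the `implementation` fingerprint of an item in crate A dirties the functions that recorded an `Implementation` use of it, while functions that only name its declaration stay clean.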

1 comments

That could indeed be done (and I want to make that happen at some point), but you must be aware that this is an optimization, and as such, trivial unrelated code changes can make compile times balloon. All of a sudden you add a field to a struct with a type that used to fall on the division boundary, and you go from multiple pseudo-crates to a single one. This kind of behavior is hell to debug and figure out as a user.
I'm not sure I'm following the scenario described, but I don't see why debugging this would be harder than at the crate level. Can you help me understand why this might be the case? I'm not sure what you mean by division boundaries / pseudo-crates here (are you suggesting that the dependency tracking won't be fine-grained and trivial changes can cause false sharing, or something else?).

Doing all this tracking correctly across layers of compilers is tricky, though, if you want to shoot for absolute optimal performance. At the limit, for truly optimal behavior, you'd track all the input information the compiler used when generating code and map it back to the source level (i.e. did I inline function A into function B? Then I need to regenerate B if A changes, but if A wasn't inlined then B doesn't need regenerating), which can be hard as most compiler optimization passes are information destroying. So yeah, I don't doubt that whatever "practical" middle ground is found as a realistic implementation can be tricky to reason about, but I suspect it almost doesn't matter if done well, because it'll outperform what you would have done by hand as outlined in this post, or at least do it for you for free in 90% of cases.
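A rough sketch of the inlining case, assuming that inlining a callee into a caller makes the caller's generated code depend on the callee's body. `InlineLog`, `record_inline`, and `to_regenerate` are hypothetical names for illustration; no real compiler exposes this API.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical log of the inlining decisions made during code generation.
#[derive(Default)]
struct InlineLog {
    /// callee -> callers whose generated code contains a copy of the callee's body
    inlined_into: HashMap<String, Vec<String>>,
}

impl InlineLog {
    /// Record that `callee` was inlined into `caller`.
    fn record_inline(&mut self, callee: &str, caller: &str) {
        self.inlined_into
            .entry(callee.to_string())
            .or_default()
            .push(caller.to_string());
    }

    /// Functions whose code must be regenerated when the body of `changed` changes:
    /// the function itself plus, transitively, every caller that inlined it.
    fn to_regenerate(&self, changed: &str) -> BTreeSet<String> {
        let mut out = BTreeSet::new();
        let mut stack = vec![changed.to_string()];
        while let Some(f) = stack.pop() {
            // `insert` returns false for already-visited functions, so cycles terminate.
            if out.insert(f.clone()) {
                if let Some(callers) = self.inlined_into.get(&f) {
                    stack.extend(callers.iter().cloned());
                }
            }
        }
        out
    }
}
```

A caller that invokes a function through an ordinary call is never recorded, so in this model it is not regenerated when only the callee's body changes. The hard part in practice is the one mentioned above: recovering this kind of log from optimization passes that discard the information.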

He's saying that if you explicitly separate your crates out and then accidentally introduce a circular dependency, you will get a compile-time error. But if you have one big crate and rely on the compiler to detect internal dependencies for performance, then when you accidentally introduce a dependency you didn't mean to, it will happily compile. It will just be slower, so you probably won't notice, and it will be really hard to debug.