by fl0ki 858 days ago
I agree, but I have yet to see a single real-world example of a Rust project meaningfully reducing its binary size by switching from monomorphization to dynamic dispatch in its own code. Many Rust developers boast that they virtually never use `dyn`, but then still appeal to it when arguing that Rust has dynamic dispatch so monomorphization is an avoidable cost.

Sometimes you can provide `T = Arc<dyn Foo>` or `T = Box<dyn Foo>` where `T: Foo` is required, but only if the trait was deliberately designed to be object-safe; it does not simply work by default. If you get to design the trait and all of its consumers yourself, you might have this option, but it's very possible that you're using a library that does not allow it. You can easily be the first person to bother trying `dyn` with a given trait and to run into these limitations.
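To make that concrete, here is a minimal sketch under stated assumptions: the trait `Foo`, the types `A` and `B`, and the functions `describe` and `describe_all` are all hypothetical names invented for illustration. The point is that the trait's author has to keep the trait object-safe and also write the forwarding impl for `Box<dyn Foo>` before a generic consumer will accept it.

```rust
// Hypothetical trait, kept object-safe: no generic methods, no `Self` by value.
trait Foo {
    fn name(&self) -> String;
}

struct A;
struct B;

impl Foo for A {
    fn name(&self) -> String {
        "A".to_string()
    }
}

impl Foo for B {
    fn name(&self) -> String {
        "B".to_string()
    }
}

// The forwarding impl the trait's author has to opt into. Without it,
// `Box<dyn Foo>` does not satisfy the bound `T: Foo`.
impl Foo for Box<dyn Foo> {
    fn name(&self) -> String {
        // Deref explicitly to the inner `dyn Foo` so this goes through the vtable
        // instead of calling this same impl again.
        (**self).name()
    }
}

// A generic consumer, monomorphized once per distinct `T` it is called with.
fn describe<T: Foo>(value: &T) -> String {
    format!("got {}", value.name())
}

fn describe_all() -> Vec<String> {
    let items: Vec<Box<dyn Foo>> = vec![Box::new(A), Box::new(B)];
    // In this snippet only `describe::<Box<dyn Foo>>` gets instantiated.
    items.iter().map(|item| describe(item)).collect()
}
```

If the library you depend on owns `Foo` and did not write that forwarding impl, you cannot add it from outside, which is the limitation described above.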

Besides that, you might not even have that much control over the concrete type used. For example, if you are generating large schemas with serde, serde decides how that code is monomorphized, not you. In contrast, for better or worse, the path of least resistance in Go is to use a reflection-based serialization framework, which has notable runtime costs (that may or may not matter to a given project) but successfully avoids compile time and binary size costs. (There are other reasons that Go binaries end up even larger than Rust ones; this just isn't one of them.)

Despite Rust's general principle of giving its users informed choices here, I am not aware of any option that does 100% dynamic dispatch for (de)serialization, so in practice this is a largely unavoidable cost in each project that is decided only by how complex the schema is.

It's also only fair to point out that C++ tends to end up in this place too, mitigated only by dynamic linking and not any magical property of the language itself. Even C can head this way because monomorphizing with macros has the same effect, though due to how such code is structured, it's also less likely to be inlined than C++ or Rust.


That's a fair observation. When I was first writing Rust, my inclination was to return impl IntoIterator<Item = T> from functions which were actually going to return a Vec<T>, because hey, if I change my mind later you can still iterate over whatever I give you instead, with no code changes.

But of course that's an anti-pattern, because such functions are in reality likely to return Vec<T> forever, and knowing that helps you. My early choice only makes sense if either I can't tell you anything more specific than impl IntoIterator<Item = T> or I already know I intend to make a change later. So these days I almost always write down exactly what is returned, unless either I can't name it or no reasonable person would care.
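A rough sketch of the two styles side by side, with every function name here made up for illustration. Both producers build the same Vec; the only difference is how much the signature admits.

```rust
// Opaque return type: callers learn only that they can iterate the result.
fn evens_opaque(limit: u32) -> impl IntoIterator<Item = u32> {
    (0..limit).filter(|n| n % 2 == 0).collect::<Vec<u32>>()
}

// Concrete return type: callers can index, sort, take the length, and so on.
fn evens_vec(limit: u32) -> Vec<u32> {
    (0..limit).filter(|n| n % 2 == 0).collect()
}

fn last_even_opaque(limit: u32) -> Option<u32> {
    // With only `IntoIterator` to go on, the caller has to walk every item.
    evens_opaque(limit).into_iter().last()
}

fn last_even_vec(limit: u32) -> Option<u32> {
    // With a `Vec`, the slice method `last` is available directly.
    evens_vec(limit).last().copied()
}
```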

For serde in particular my guess is that if you need lots of dynamism, serde is the wrong approach even though it's popular. It might be interesting to build a different project which focuses on dynamic dispatch for the same work and tries to re-use as much of the serde ecosystem as possible. Not work which attracts me though.

Note that `impl Foo` return types don't actually cost anything extra with regard to code size: the compiler knows what the actual type is and there is no dynamic dispatch. Only actual generics have an impact here, and `impl` in a return position doesn't count.

The code size cost doesn't live in my code, but in yours.

Because I didn't admit you were getting a Vec, if you actually need a Vec you can't just use the one I gave you. You must jump through hoops to turn whatever I gave you into a Vec, bloating your code.

The implementation is pretty clever: it is probably not going to meticulously take my Vec to pieces, throw it away and make you a new one, and will instead just give you the same Vec. But this trick is fragile, so it's much better not to need it at all.
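Here is a small sketch of the hoop in question, assuming a hypothetical library function `names` and a caller `sorted_names`. As far as I know, the allocation reuse when collecting a Vec's iterator back into a Vec is an optimization in the standard library, not something the signature promises.

```rust
// Hypothetical library function that hides the fact that it returns a Vec.
fn names() -> impl IntoIterator<Item = String> {
    vec!["b".to_string(), "a".to_string()]
}

// The caller needs a real Vec (here, to sort it), so it has to rebuild one
// through `into_iter().collect()` even though a Vec was there all along.
fn sorted_names() -> Vec<String> {
    let mut v: Vec<String> = names().into_iter().collect();
    v.sort();
    v
}
```

Had `names` declared `Vec<String>` as its return type, the caller could have sorted the value it was handed with no conversion step.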

Maybe a more specific way to put it is: you only pay for the (combinations of) types you actually use, whether that's in argument position, return position, or even a local binding. So if it's always Vec<T> it's not costing much more in compile time or code size, but if it's sometimes another type then you do now pay for both.
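A tiny illustration of that, with `total` and `totals` as made-up names. Each distinct concrete type passed to the generic function gets its own compiled copy.

```rust
use std::collections::BTreeSet;

// Generic in argument position: one compiled copy per distinct `I` actually used.
fn total<I: IntoIterator<Item = u64>>(items: I) -> u64 {
    items.into_iter().sum()
}

fn totals() -> (u64, u64) {
    // Instantiates `total::<Vec<u64>>`.
    let from_vec = total(vec![1, 2, 3]);
    // Instantiates `total::<BTreeSet<u64>>`, a second copy of the same function.
    let from_set = total(BTreeSet::from([4, 5]));
    (from_vec, from_set)
}
```

If every call site passed a Vec, only the first instantiation would exist.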