by tialaramex 854 days ago
I believe the menu of options is undesirable until you actually know you have requirements you can evaluate them against. As much as possible, even if I do have a choice, there should be a default and I needn't be asked. When I make a Rust project, cargo notices I have git and, since I didn't say otherwise, it mints a Git repo for the new project automatically. It doesn't insist on asking if I want one, and then asking if it should have the obvious name, and then asking if it should use my default local git settings. The defaults for all of these are IMNSHO obvious, and my tacit approval via not having explicitly turned this off is enough.

Do you want a Doodad, a Gooba or a Wumsy? No idea? Me neither. So until I care, I'd rather not be asked to choose. But once I discover that I need something with at least 40% Flounce, I can see that Doodads and Goobas are both rated at 50% Flounce, whereas a Wumsy has only 10% Flounce. Now we're making an informed choice, and it should be easy enough to insist on a Doodad to meet my requirement.

If I measure that monomorphization is out of hand in my codebase, I can use dyn to get that back under control for a fair price, but I think the default here is sound.
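To make that trade concrete, here is a minimal sketch of the same function written both ways. The `Shape` trait, the two structs and both function names are invented for illustration; nothing here comes from a real codebase.

```rust
// Hypothetical trait and types, used only to contrast the two dispatch styles.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Circle(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.0 * self.0
    }
}

// Generic version: the compiler emits a separate copy of this function
// for every concrete `S` it is called with (monomorphization).
fn describe_generic<S: Shape>(shape: &S) -> String {
    format!("area = {:.2}", shape.area())
}

// `dyn` version: a single copy is emitted, and `area` is looked up
// through a vtable at run time.
fn describe_dyn(shape: &dyn Shape) -> String {
    format!("area = {:.2}", shape.area())
}

fn main() {
    // Two instantiations of `describe_generic`, one per type.
    println!("{}", describe_generic(&Square(2.0)));
    println!("{}", describe_generic(&Circle(1.0)));

    // One `describe_dyn`, shared by both types.
    println!("{}", describe_dyn(&Square(2.0)));
    println!("{}", describe_dyn(&Circle(1.0)));
}
```

Whether the `dyn` version is a net win depends on how many instantiations it replaces and how much the indirect call costs you, which is why measuring first matters.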

1 comments

I agree, but I have yet to see a single real-world example of a Rust project meaningfully reducing its binary size by switching from monomorphization to dynamic dispatch in its own code. Many Rust developers boast that they virtually never use `dyn`, but then still appeal to it when arguing that Rust has dynamic dispatch so monomorphization is an avoidable cost.

Sometimes you can provide `T = Arc/Box<dyn Foo>` where `T: Foo` is required, but only if the trait was deliberately designed to be object-safe; you don't get that by default. If you get to design the trait and all of its consumers yourself, you might have this option, but it's very possible that you're using a library that does not make this possible. You can easily be the first person to bother trying `dyn` with a given trait, and the first to run into these limitations.
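A minimal sketch of both halves of that, assuming a made-up `Foo` trait with a single `name` method and a made-up `Encode` trait with a generic method. The forwarding impl is the piece a library has to opt into. If `Foo` lived in someone else's crate and they had not written that impl, my understanding is that the orphan rule would stop you from adding it yourself.

```rust
use std::io::{self, Write};

// Hypothetical object-safe trait; `Foo` and `name` are placeholders.
trait Foo {
    fn name(&self) -> String;
}

struct English;
struct French;

impl Foo for English {
    fn name(&self) -> String {
        "english".to_string()
    }
}

impl Foo for French {
    fn name(&self) -> String {
        "french".to_string()
    }
}

// A generic consumer, written the way a library typically would write it.
fn describe<T: Foo>(value: T) -> String {
    format!("got {}", value.name())
}

// Forwarding impl. Without it, `Box<dyn Foo>` does not satisfy `T: Foo`:
// a boxed trait object does not implement its own trait automatically.
impl Foo for Box<dyn Foo> {
    fn name(&self) -> String {
        // Deref down to the inner `dyn Foo`, then dispatch through the vtable.
        (**self).name()
    }
}

// Hypothetical trait that is NOT object-safe: `encode` is generic, so it
// cannot go in a vtable, and `Box<dyn Encode>` is rejected by the compiler.
trait Encode {
    fn encode<W: Write>(&self, out: &mut W) -> io::Result<()>;
}

impl Encode for English {
    fn encode<W: Write>(&self, out: &mut W) -> io::Result<()> {
        out.write_all(b"english")
    }
}

fn main() {
    // Static dispatch: `describe::<English>` is instantiated.
    println!("{}", describe(English));

    // Dynamic dispatch through the same generic API: only
    // `describe::<Box<dyn Foo>>` is instantiated, whatever is in the box.
    let boxed: Box<dyn Foo> = Box::new(French);
    println!("{}", describe(boxed));

    // `Encode` still works generically, just not as a trait object.
    let mut buf: Vec<u8> = Vec::new();
    English.encode(&mut buf).unwrap();
    println!("{} bytes encoded", buf.len());
}
```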

Besides that, you might not even have that much control over the concrete type used. For example, if you are generating large schemas with serde, serde decides how that code is monomorphized, not you. In contrast, for better or worse, the path of least resistance in Go is to use a reflection-based serialization framework, which has notable runtime costs (that may or may not matter to a given project) but successfully avoids compile time and binary size costs. (There are other reasons that Go binaries end up even larger than Rust ones; this just isn't one of them.)

Despite Rust's general principle of giving its users informed choices here, I am not aware of any option that does 100% dynamic dispatch for (de)serialization, so in practice this is a largely unavoidable cost in each project that is decided only by how complex the schema is.

It's also only fair to point out that C++ tends to end up in this place too, mitigated only by dynamic linking and not any magical property of the language itself. Even C can head this way because monomorphizing with macros has the same effect, though due to how such code is structured, it's also less likely to be inlined than C++ or Rust.

That's a fair observation. I know that when I was first writing Rust, my inclination was to return impl IntoIterator<Item = T> from functions which are actually going to return a Vec<T>, because hey, if I change my mind you can still iterate over whatever I give you instead, with no code changes.

But of course that's an anti-pattern because they are in reality likely to forever just return Vec<T> and knowing that helps you. My early choice only makes sense if either I can't tell you anything more specific than impl IntoIterator<Item = T> or I already know I intend to make a change later. So these days I almost always write down what exactly is returned unless either I can't name it or no reasonable person would care.
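Roughly what the two signatures look like side by side. This is only a sketch; `evens_opaque` and `evens_vec` are invented names, and both bodies build exactly the same Vec.

```rust
// Opaque return type: callers only learn that the result can be iterated.
fn evens_opaque(limit: u32) -> impl IntoIterator<Item = u32> {
    (0..limit).filter(|n| n % 2 == 0).collect::<Vec<u32>>()
}

// Explicit return type: callers know it is a Vec and can treat it as one.
fn evens_vec(limit: u32) -> Vec<u32> {
    (0..limit).filter(|n| n % 2 == 0).collect()
}

fn main() {
    // With the explicit type, Vec and slice methods are available directly.
    let v = evens_vec(10);
    println!("{} items, last = {:?}", v.len(), v.last());

    // With the opaque type, `evens_opaque(10).len()` does not compile;
    // iterating is the only thing the signature promises.
    let total: u32 = evens_opaque(10).into_iter().sum();
    println!("total = {}", total);
}
```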

For serde in particular, my guess is that if you need lots of dynamism, serde is the wrong approach even though it's popular. It might be interesting to build a different project which focuses on dynamic dispatch for the same work and tries to re-use as much of the serde ecosystem as possible. Not work which attracts me though.

Note that `impl Foo` return types don't actually cost anything extra with regard to code size: the compiler knows what the actual type is and there is no dynamic dispatch. Only actual generics have an impact here, and `impl` in a return position doesn't count.

The code size cost doesn't live in my code, but in yours.

Because I didn't admit you were getting a Vec, if you actually need a Vec you can't just use the one I gave you. You must jump through hoops to turn whatever I gave you into a Vec, bloating your code.

The implementation is pretty clever: it is probably not going to meticulously take my Vec to pieces, throw it away and make you a new one, instead just giving you the same Vec. But this trick is fragile, so it's much better not to even need it.
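For illustration, the hoop looks something like this on the caller's side. `load_names` and `longest` are hypothetical names. Whether the `collect` ends up reusing the original allocation is a standard library implementation detail, so the sketch deliberately doesn't check for it.

```rust
// Hypothetical library function that hides the fact that it returns a Vec.
fn load_names() -> impl IntoIterator<Item = String> {
    vec!["ada".to_string(), "grace".to_string()]
}

// Hypothetical caller-side function that genuinely needs a Vec.
fn longest(names: Vec<String>) -> Option<String> {
    names.into_iter().max_by_key(|n| n.len())
}

fn main() {
    // `longest(load_names())` does not compile: the opaque type is not
    // `Vec<String>` as far as the type checker is concerned.
    // The caller has to rebuild a Vec from the iterator instead.
    let names: Vec<String> = load_names().into_iter().collect();
    println!("{:?}", longest(names));
}
```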

Maybe a more specific way to put it is: you only pay for the (combinations of) types you actually use, whether that's in argument position, return position, or even a local binding. So if it's always Vec<T> it's not costing much more in compile time or code size, but if it's sometimes another type then you do now pay for both.
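A small sketch of that, assuming a made-up generic function `sum_all`. It returns `std::any::type_name` for its type parameter purely as a diagnostic, so you can see which instantiation handled each call.

```rust
use std::any::type_name;

// Generic over the input type. Returns the sum plus the name of the
// concrete type this particular instantiation was compiled for.
// (`type_name` is a best-effort diagnostic, not a stable identifier.)
fn sum_all<I: IntoIterator<Item = u32>>(items: I) -> (u32, &'static str) {
    (items.into_iter().sum(), type_name::<I>())
}

fn main() {
    // Both calls use Vec<u32>, so they share one instantiation.
    let a = sum_all(vec![1, 2, 3]);
    let b = sum_all(vec![4, 5]);

    // A range is a different type, so a second copy of `sum_all` is generated.
    let c = sum_all(0..4);

    println!("{} via {}", a.0, a.1);
    println!("{} via {}", b.0, b.1);
    println!("{} via {}", c.0, c.1);
}
```

If every caller passed a Vec, only the first instantiation would exist in the binary.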