| Hi, author here. Definitely not trying to diss Rust, I love Rust! I'm pointing out some interesting overheads that aren't well known by the average Rust programmer, which Mojo was able to improve upon with the power of hindsight and being a newer thing. For your points below, let me clarify:

> There's no implicit string copying in Rust. Even when passing String ownership, it will usually be passed in registers.

The String metadata can be passed in registers if LLVM does that optimization, but it's not guaranteed and doesn't always happen. A Rust move is just a memcpy, and there are situations where LLVM doesn't optimize them away, resulting in Rust programs doing a lot more memcpy than people realize.

> It's idiomatic to use `&str` by default.

True if you want it to be immutable, but this actually adds to my point. That is the default behavior in Mojo, without having to understand things like deref coercion and the difference between `&str` and `&String`. In Rust it's an unintuitive best practice, which everyone has to learn pretty early in their journey. In Mojo they get the best behavior by default, which gives them a more gentle learning curve, important for our Python audience. Default behavior > idiomatic things to learn.

> The borrow checker doesn't influence destructors.

I didn't claim that. My point was that Rust does do runtime checks using drop flags, to check if a value should be dropped. This can be done statically during compilation, but won't happen if the initialization state of an object is unknown at compile time: https://doc.rust-lang.org/nomicon/drop-flags.html

> &String and usize can't have a destructor, and can be forgotten at any time

In the example, the call stack is growing with a new reference and usize in each frame for each call. This is why tail recursion in Rust has so many issues: those values need to be available to the end of scope to satisfy Rust's guarantees, so they can't be "forgotten at any time".
It also overflows the stack a lot faster.

> Their benchmark intended to demonstrate tail call cost compiles to a constant, with no `factorial()` calls at run time. They're only benchmarking calls to black_box(1307674368000). Criterion put read_volatile in there. Mojo probably uses a zero-cost LLVM intrinsic instead.

If the Rust benchmark isn't calling `factorial()`, it should be instant and faster than Mojo, but the Rust version is much slower. `benchmark.keep` in Mojo is a "clobber" directive, indicating that the value could be read or written to at any time, so LLVM doesn't optimize away the function calls to get the result.

Thanks for taking the time to read the post and write out your thoughts. Really enjoying the discussion around these topics. |
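
For readers who haven't run into drop flags, here is a minimal sketch of the situation the nomicon page linked above describes: a binding that is initialized on only one branch. The `Noisy` type, the `maybe_init` function, and the shared log are hypothetical names made up for this illustration; they are not from the blog post or its benchmarks.

```rust
use std::cell::RefCell;

// Hypothetical helper type: records its name in a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// `_value` is initialized on only one branch, so whether it must be dropped
// at the end of the function depends on `flag`, which is a run-time value.
fn maybe_init(flag: bool, log: &RefCell<Vec<&'static str>>) {
    let _value;
    if flag {
        _value = Noisy { name: "value", log };
    }
    // End of scope: the destructor runs here only if `_value` was initialized.
}

fn main() {
    let log = RefCell::new(Vec::new());
    maybe_init(true, &log); // destructor runs once
    maybe_init(false, &log); // nothing to drop
    println!("{:?}", log.borrow()); // prints ["value"]
}
```

When the initialization state is the same on every path, the compiler can decide statically whether to emit the drop, and no flag is needed. Whether a flag survives optimization in a case like this one can depend on inlining and constant propagation, so treat the sketch as showing the semantics rather than the generated code.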
And no: in the example with `&String` and `usize`, the stack isn't growing: https://rust.godbolt.org/z/6zW6WfGE7
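
To make the shape of that argument concrete, here is a hypothetical stand-in (not the code behind the link above) for a recursive function that takes a `&String` and a `usize`. Neither parameter type has a destructor, so nothing has to run after the recursive call returns, and the call sits in tail position. Rust does not guarantee tail-call elimination; whether the recursion becomes a loop is up to the optimizer and the build profile, so the loop version is shown only as the form an optimized build may reduce it to.

```rust
// Hypothetical example: advance `i` until it reaches the end of `s`.
// `&String` is used on purpose to mirror the discussion; `&str` would be
// the idiomatic parameter type.
fn walk(s: &String, i: usize) -> usize {
    if i >= s.len() {
        return i;
    }
    // No destructors are pending for `s` or `i`, so this is a tail call.
    walk(s, i + 1)
}

// The same logic written as a loop, for comparison.
fn walk_loop(s: &String, mut i: usize) -> usize {
    while i < s.len() {
        i += 1;
    }
    i
}

fn main() {
    let s = String::from("hello");
    println!("{} {}", walk(&s, 0), walk_loop(&s, 0)); // prints 5 5
}
```

Comparing the assembly of `walk` at different optimization levels is one way to check whether the frames are actually reused for a given compiler version.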