| Passing arguments in registers is not an optimization; it is part of the ABI. It always happens up to a certain number of arguments, and Rust in particular uses an ABI that flattens more structs into registers than C++ does. Other moves could be memcpy, but there's a distinction between Rust saying moves behave like memcpy and moves actually being memcpy calls. String's 3×size_t (or 2×size_t for Box<str>/Arc<str>) is below LLVM's threshold for emitting an actual memcpy call. Rust also has optimization passes for eliminating redundant moves. You're giving the impression that memcpy happens all over the place, when in reality it's quite rare, and it certainly doesn't happen in the simple cases you describe. In Rust, knowledge of ownership and the zoo of string types is a requirement (e.g. use of &String is a novice error). It's nice that Mojo can hide it, and you could celebrate that without making dubious performance claims. > True if you want it to be immutable, but this actually adds to my point. Sigh, it adds to the inaccuracies. Mutable strings are &mut String, passed as a single pointer, so a mutable string is an even better case of a thin reference that doesn't need memcpy. > those values need to be available to the end of scope to satisfy Rust's guarantees No, they don't. You're conflating Rust's guaranteed Drop order (which does interfere with TCO) with borrow checking and stack usage, which don't. For references and Copy types, Rust has an "eager drop" behavior: their existence on the stack is neither guaranteed nor necessary. Borrow checking scopes are hypothetical for the sake of the check, and don't influence code generation in any way. You can literally remove the borrow checker and lifetimes from Rust entirely, and the code will compile to the same instructions; the mrustc implementation is a real example of that. Your example function, where you try to demonstrate how the arguments prevent TCO, compiles to a single `ret`. 
> it should be instant and faster than Mojo The `factorial()` is instant, but the `black_box` isn't, because Rust/Criterion implements it differently than Mojo does. So Mojo has a faster `benchmark.keep` function, you failed to benchmark the relevant function, and you presented a misleading benchmark with a wrong conclusion. You should validate your claims about what Rust does by actually checking the output. Try https://rust.godbolt.org/ (don't forget to add -O to the flags!) or use cargo-show-asm. |
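
To make the size claims above concrete, here is a minimal sketch that measures the types mentioned in machine words. The `words` and `shout` helpers are hypothetical names invented for this illustration, and the exact layout of `String` is an implementation detail of current rustc rather than a formal guarantee, so treat the numbers as what you would typically observe, not as a specification.

```rust
use std::mem::size_of;
use std::sync::Arc;

/// Hypothetical helper: size of `T` measured in machine words (`usize`).
fn words<T>() -> usize {
    size_of::<T>() / size_of::<usize>()
}

/// Hypothetical helper: mutates a string in place through `&mut String`.
/// The argument is a thin reference, so only one pointer is passed.
fn shout(s: &mut String) {
    s.push('!');
}

fn main() {
    // Owned string: pointer, capacity, length.
    println!("String:      {} words", words::<String>());
    // Owned, non-growable strings: pointer and length.
    println!("Box<str>:    {} words", words::<Box<str>>());
    println!("Arc<str>:    {} words", words::<Arc<str>>());
    // Thin reference to a String: a single pointer.
    println!("&mut String: {} words", words::<&mut String>());

    let mut s = String::from("hi");
    shout(&mut s);
    println!("{s}");
}
```

None of this shows what the optimizer does with a move; for that, the generated assembly on rust.godbolt.org with -O remains the thing to check.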