by Dr_Emann 853 days ago
Rust compiles `factorial` into a loop, with no recursion (tail or otherwise) at all, and it constant-folds `factorial(15, 1)` into `1307674368000`: https://rust.godbolt.org/z/bGrWfYKrP. As has been pointed out a few times, you're benchmarking `criterion::black_box` vs `benchmark.keep` (try the newer `std::hint::black_box`, which is built into the compiler and should have lower overhead).

And no: in the example with `&String` and `usize`, the stack isn't growing: https://rust.godbolt.org/z/6zW6WfGE7
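To make the constant-folding point concrete, here is a minimal sketch. The accumulator-style signature is an assumption inferred from the `factorial(15, 1)` call; the blog's actual function may differ. The first call has compile-time-constant arguments, so an optimized build is free to replace it with the literal result. The second call hides its arguments behind `std::hint::black_box`, so the arithmetic has to happen at run time.

```rust
use std::hint::black_box;

/// Tail-recursive factorial with an accumulator.
/// Hypothetical signature, inferred from the `factorial(15, 1)` call shape.
fn factorial(n: u64, acc: u64) -> u64 {
    if n <= 1 {
        acc
    } else {
        factorial(n - 1, acc * n)
    }
}

fn main() {
    // Constant arguments: the optimizer may fold this call into a literal.
    let folded = factorial(15, 1);

    // black_box makes the arguments opaque to the optimizer,
    // so the result cannot be precomputed at compile time.
    let opaque = factorial(black_box(15), black_box(1));

    println!("{folded} {opaque}");
}
```

Both calls return the same value; the difference only shows up in the generated assembly and in what a benchmark ends up measuring.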


I updated the blog with full benchmark reproduction instructions. I also removed `criterion::black_box` altogether, and it made no difference in performance. Removing `benchmark.keep` from Mojo causes it to optimize away everything and run in less than a picosecond.

If you could show me a benchmark that supports what you're saying that'd be great, thanks.

I did a lot more benchmarks, and Rust TCO is happening in a lot of scenarios. Thanks for pointing this out; I updated this section in the blog.

Hi, I think even the remaining benchmark isn't showing what you're trying to show:

https://rust.godbolt.org/z/r9rP6xohb

Rust realizes the vector is never used, so it never does any allocation or recursion. The whole thing just turns into a loop that counts up to 999_999_999.

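And some back-of-the-napkin math says there's no way either benchmark is actually allocating anything. Even if malloc took 1 nanosecond (it _doesn't_), 999_999_999 allocations would take 999_999_999 nanoseconds, which is 0.999999999 seconds. Any run that finishes faster than that can't be allocating on every iteration.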
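The same lower bound can be written as a small sketch. The helper name `min_alloc_time` is hypothetical, and the 1 ns per-call cost is the deliberately optimistic figure from the estimate above, not a measurement.

```rust
use std::time::Duration;

/// Lower bound on total run time if every iteration really allocated.
/// `per_call_nanos` is an assumed cost per allocation, not a measured one.
fn min_alloc_time(calls: u64, per_call_nanos: u64) -> Duration {
    Duration::from_nanos(calls * per_call_nanos)
}

fn main() {
    let bound = min_alloc_time(999_999_999, 1);
    // A benchmark that reports less than this cannot be allocating each time.
    println!("lower bound: {} s", bound.as_secs_f64());
}
```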

It _is_ somewhat surprising that Rust doesn't realize the loop can be completely optimized away, like it does without the unused Vec, but this benchmark still isn't showing what you're trying to show.
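For illustration, here is a rough sketch of the pattern being discussed. It is not the code from the godbolt link; the function names, the `Vec` capacity, and the small limit are all assumptions chosen so the example also runs in a debug build. In the first function nothing ever reads the `Vec`, so an optimized build is allowed to drop the allocation. In the second, `std::hint::black_box` tells the optimizer to treat the `Vec` as used, which should keep the allocation in place.

```rust
use std::hint::black_box;

/// The Vec is never read, so the optimizer may remove the allocation entirely.
fn count_unused(n: u64, limit: u64) -> u64 {
    let _scratch: Vec<u64> = Vec::with_capacity(16);
    if n >= limit {
        n
    } else {
        count_unused(n + 1, limit)
    }
}

/// The Vec is written to and passed through black_box,
/// so the optimizer has to assume it is observed.
fn count_observed(n: u64, limit: u64) -> u64 {
    let mut scratch: Vec<u64> = Vec::with_capacity(16);
    scratch.push(n);
    black_box(&scratch);
    if n >= limit {
        n
    } else {
        count_observed(n + 1, limit)
    }
}

fn main() {
    // Small limit (an assumption) to keep recursion depth safe without optimizations.
    println!("{}", count_unused(0, 1_000));
    println!("{}", count_observed(0, 1_000));
}
```

Both functions return the same count. Whether each one actually allocates is best confirmed by inspecting the assembly or by profiling, rather than inferred from timings alone.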

True, thanks! I updated the example again and profiled it this time to make sure each program is actually allocating.