by maemre 1478 days ago
I think the answer is a combination of both and inlining. With Rc, the compiler can inline the reference count increments/decrements as well as the non-public function add_numbers, and the surrounding code is simple enough that it can do dead code elimination and other optimizations to remove the unobservable behavior. If you switch to Arc (which is a closer equivalent of shared_ptr, as its reference counting is also thread-safe, using atomic operations rather than locks), the generated code is more comparable (funnily enough, part of the code for drop in Arc is deliberately not inlined, so it prevents this optimization [1]).
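
To make the comparison concrete, here is a minimal sketch of the two variants. The body of add_numbers and the name add_numbers_arc are my assumptions for illustration; the original code from the thread isn't reproduced here, so treat this as a stand-in rather than the exact function being discussed.

```rust
use std::rc::Rc;
use std::sync::Arc;

// Hypothetical stand-in for the private `add_numbers` discussed above.
// Rc::clone bumps a plain (non-atomic) counter, and dropping the clone
// decrements it again, so the pair has no observable effect here.
fn add_numbers(a: &Rc<i32>, b: &Rc<i32>) -> i32 {
    let a2 = Rc::clone(a);
    let b2 = Rc::clone(b);
    *a2 + *b2
}

// Same shape with Arc (hypothetical name): the counter updates are atomic,
// which gives the optimizer less freedom to drop them.
fn add_numbers_arc(a: &Arc<i32>, b: &Arc<i32>) -> i32 {
    let a2 = Arc::clone(a);
    let b2 = Arc::clone(b);
    *a2 + *b2
}

fn main() {
    let (a, b) = (Rc::new(2), Rc::new(3));
    println!("Rc sum: {}", add_numbers(&a, &b));

    let (c, d) = (Arc::new(2), Arc::new(3));
    println!("Arc sum: {}", add_numbers_arc(&c, &d));
}
```

Both functions return the same value; any difference would only show up in the optimized assembly (for example in a release build on Compiler Explorer), and what you see there may vary by compiler version.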

Also, if one makes add_numbers public, the generated code is much larger. I don't know why the compiler chooses not to inline it in that case. My guess (really speculating here) is that a public function can be called from multiple locations and the IR for the function is over the inlining budget.

[1]: https://doc.rust-lang.org/src/alloc/sync.rs.html#1088