by je_bailey 1309 days ago
I don’t understand the point of this. Is there a trade off in being stack efficient and speed?

The main goal is replacing pointless stack-to-stack copies with simply mutating in place on the stack in the first place.

Due to some mix of:

* Rust code relying more on copies than C++ (for example, it's harder to make something uninitialized and fill it in)

* LLVM missing optimizations that Rust relies on more heavily than C++

* No real guarantees around RVO / NRVO

Rust code often will put something on the stack, and then just instantly copy it somewhere else on the stack, even in optimized code. I've observed this happening sometimes pretty blatantly myself.
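To make the first bullet concrete, here is a rough sketch of what "make something uninitialized and fill it in" looks like in Rust. The function name `init_in_place` and the 1024-byte size are made up for illustration; the point is only that it takes `MaybeUninit` and `unsafe` to do what a plain uninitialized local does in C++, so most Rust code builds a value and moves it instead.

```rust
use std::mem::MaybeUninit;

// Hypothetical helper: reserve an uninitialized buffer on the stack
// and fill it in place, rather than building a value and moving it.
pub fn init_in_place(fill: u8) -> [u8; 1024] {
    let mut buf = MaybeUninit::<[u8; 1024]>::uninit();
    unsafe {
        // Write every byte before claiming the buffer is initialized.
        buf.as_mut_ptr().cast::<u8>().write_bytes(fill, 1024);
        buf.assume_init()
    }
}
```

Whether the final return is itself copied is still up to the optimizer; this only shows the in-place fill.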

> No real guarantees around RVO / NRVO

Shouldn’t Rust in theory have a lot more freedom in defining its calling conventions than C++ has? I wonder if there’s anything that prevents doing RVO by default, or if it just hasn’t been a priority yet.

I think in theory it could, but something was definitely getting clogged in the optimizer. I'd see code like

    let a = A::new(...);
    return a;

This would create `a` on the stack, and immediately copy `a` into the stack region the caller was expecting it in. This seemed to get worse as struct size got larger, so I'm guessing there was so much IL the optimizer had to churn through that it just gave up at some point.
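A self-contained version of that pattern, as a sketch: `Big`, its 4096-byte size, and `make` are placeholder names, not anything from the article. The behavior is the same whether or not the copy gets elided, which is why this only shows up in the generated assembly or in stack usage.

```rust
// Hypothetical type, large enough that a stray copy is not free.
pub struct Big {
    pub data: [u8; 4096],
}

impl Big {
    pub fn new(fill: u8) -> Big {
        Big { data: [fill; 4096] }
    }
}

// Build a local, then return it. Whether `a` is constructed directly in
// the caller's return slot or built locally and then copied over is up
// to the optimizer; the language does not guarantee either.
pub fn make(fill: u8) -> Big {
    let a = Big::new(fill);
    a
}
```

To see what actually happens, you'd compile with optimizations and look for a `memcpy` (or an inlined equivalent) in the output for `make`.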

Evaluation order of function arguments is unspecified in C++, whereas it is well specified in Rust (left to right). This makes things easier to reason about in Rust, but does give the optimizer less wiggle room.

https://en.cppreference.com/w/cpp/language/eval_order

How is evaluation order even relevant for RVO?

In code like `a(b(d),c(e))`, I think it could be relevant. You would want different code based on the size of `b(d)`, `c(e)`, `d`, and `e`. If you must evaluate b before c, that would eliminate some possible arrangements.

Specifically, if `e` and `b(d)` are huge, you probably would want to evaluate `c(e)` first and then `b(d)`.
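For reference, a small sketch showing the fixed order on the Rust side. The closures `a`, `b`, `c` mirror the names in the expression above, and `eval_order` is just a made-up wrapper. When the calls have observable side effects like these, the compiler has to preserve this order.

```rust
use std::cell::RefCell;

// Records the order in which the calls in `a(b(1), c(2))` run.
pub fn eval_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let b = |x: u32| { log.borrow_mut().push("b"); x + 1 };
        let c = |x: u32| { log.borrow_mut().push("c"); x * 2 };
        let a = |x: u32, y: u32| { log.borrow_mut().push("a"); x + y };
        // Rust evaluates call arguments left to right: b, then c, then a.
        let _ = a(b(1), c(2));
    }
    log.into_inner()
}
```

The equivalent C++ could legally run `c` before `b`.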

There was an RFC for them, but it didn't get much traction.

"Memory moves to the stack frequently represent wasted computation. For the most part, they're CPU cycles that are spent shuffling data from one place to another instead of performing useful work. Stack-to-stack memory moves in particular are very likely to represent pure overhead; non-stack-to-stack memory moves are sometimes genuinely useful and necessary but frequently also represent waste."

It's essentially a critique that the optimizer is missing opportunities.

FTA:

> Why do we care about stack memory moves?

> Memory moves to the stack frequently represent wasted computation. For the most part, they're CPU cycles that are spent shuffling data from one place to another instead of performing useful work. Stack-to-stack memory moves in particular are very likely to represent pure overhead; non-stack-to-stack memory moves are sometimes genuinely useful and necessary but frequently also represent waste.

> Is there a trade off in being stack efficient and speed?

It's just Rust being slightly less efficient: it spends instructions doing unnecessary stack-to-stack copies, and has larger stack frames (to hold the redundant copies, which can be an issue both with deep recursion and for inlining).