by haberman 2584 days ago
Coming from C++, this always throws me a bit:

    if (0..=10).contains(&5) {
I assume we are taking a ref to "5" (and not passing by value) because "Range" is a generic type that might be too big to want to copy (or it may not be copyable at all). But taking a reference to a number for a simple operation like this feels... weird. It makes me worry that Rust is going to be passing around pointers to some stack-allocated "5", but I hope that Rust is actually much smarter than that?
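For anyone who wants to try it, here is a minimal runnable version of that snippet. The helper name `in_range` is made up for illustration; the only real API used is `RangeInclusive::contains`, which borrows its argument, hence the `&`.

```rust
// Hypothetical helper wrapping the call from the snippet above.
fn in_range(n: i32) -> bool {
    // `contains` takes its argument by shared reference,
    // so the call site has to write `&n`.
    (0..=10).contains(&n)
}

fn main() {
    println!("5 in 0..=10:  {}", in_range(5));
    println!("11 in 0..=10: {}", in_range(11));
}
```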
6 comments

Part of the problem is conceptualizing this as a reference (which, admittedly, it is without optimization).

Instead, this is a borrow which indicates that the callee will not mutate the argument. It makes more sense in this context.

> Part of the problem is conceptualizing this as a reference (which, admittedly, it is without optimization).

Isn't that also the official name of the feature in Rust? Isn't &T in Rust pronounced "reference to T" and "&mut T" pronounced "mutable reference to T"? That is the impression I get from: https://doc.rust-lang.org/book/ch04-02-references-and-borrow...

It is a reference, yes, and without optimization, is a pointer to a value.

Your parent is suggesting to not think of it in such a low-level way, and instead think of it as a permission. In this case, the pointer being optimized away makes more sense, as it’s not really about it being a pointer.

I’m of two minds about it, to be honest.

Thanks for the book! A++ would read again.

https://www.amazon.com/Rust-Programming-Language-Steve-Klabn...

You're welcome!
If you had a Range<BigNum> then it would make a lot of sense. I can’t remember if Range<T>’s constraints on T would support that, though.
AFAIK the only constraint on `Range::contains` is that the `Idx` type conforms to `PartialOrd`. So you could in fact have a `Range<String>` if you so desired.
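A quick sketch of the `Range<String>` case, assuming nothing beyond the standard library. The helper `word_range` and the sample words are invented for illustration. Here the reference clearly pays off, since checking membership does not consume or clone the `String`.

```rust
use std::ops::Range;

// Hypothetical helper: a half-open range whose bounds are Strings.
fn word_range() -> Range<String> {
    String::from("a")..String::from("m")
}

fn main() {
    let range = word_range();
    let inside = String::from("hello");
    let outside = String::from("zebra");

    // Strings compare lexicographically, so "a" <= "hello" < "m".
    println!("{}", range.contains(&inside));
    println!("{}", range.contains(&outside));

    // Both Strings were only borrowed, so they are still usable here.
    println!("{} {}", inside, outside);
}
```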
Rust doesn't have a syntax distinction for passing by reference vs passing by value (which may sound terrible to a C++ programmer, but Rust has no copy constructors, and types are non-copyable by default, so nothing expensive gets copied by surprise). For example, in the C sense, `Box<u8>` is a pointer, and `&str` is a struct passed by value.

Rust is all about ownership, and `&` means "don't free() this".
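To make the `Box<u8>` vs `&str` point concrete, here is a small sketch that just prints sizes. It assumes the usual layout where `Box<u8>` is a single pointer and `&str` is a pointer plus a length.

```rust
use std::mem::size_of;

fn main() {
    // One machine word: Box<u8> is a single pointer to the heap.
    println!("usize:   {} bytes", size_of::<usize>());
    println!("Box<u8>: {} bytes", size_of::<Box<u8>>());

    // Two machine words: &str is a (pointer, length) pair that is
    // itself handed around by value.
    println!("&str:    {} bytes", size_of::<&str>());
}
```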

what steveklabnik said. if I understand correctly, this is dealing with that 5 as if integers were a generic type. which is the expected "default" behavior (as in compiler design, not programmer daily use). a generic type would be stored on the heap, and references there make sense to avoid copies. what happens here is that integers are special because they are stored in the stack, they are immutable, and always copied, but this special behavior, which makes the reference "absurd" here, hasn't been dealt with (yet).

so yeah, it looks super weird, but when you stop to think about it, it's just that we are so used to integers being "special" that when we see them treated like a generic type it makes us cringe. interesting.

This is incorrect. In general, a reference can point to either the stack or the heap, including when generics are used. It is possible in theory for the Rust compiler to implement an ABI optimization where function arguments that are references to small values would be passed by value instead, at least in some cases (see my other post). However, that optimization does not currently exist.

Edit: But in this case the reference will be optimized away anyway, because the callee function will be inlined.

Edit 2: Actually, the optimization does exist at the LLVM level, though it only applies in some cases. See my other comment:

https://news.ycombinator.com/item?id=19997818
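A minimal sketch of the "stack or heap" point, with a made-up generic helper called `double`. The function only ever sees a `&T`, so it has no way of knowing where the value lives.

```rust
// Hypothetical generic helper: works on any `&T` regardless of
// whether the T sits on the stack or inside a heap allocation.
fn double<T>(value: &T) -> T
where
    T: Copy + std::ops::Add<Output = T>,
{
    *value + *value
}

fn main() {
    let on_stack: i32 = 21;
    let on_heap: Box<i32> = Box::new(21);

    // Reference to a local variable.
    println!("{}", double(&on_stack));
    // Reference into the Box's heap allocation.
    println!("{}", double(&*on_heap));
}
```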

Perhaps "special" here means "implements the std::marker::Copy trait"?

Or is there something else going on?

The optimizer should take care of that, yeah. The ABI isn’t defined so it can re-write stuff. And inlining, all that fun stuff.
Will the calling convention pass this by value even if it isn't inlined? My hope is that the ABI would systematically decide: "passing a const ref to a type that fits in a machine word is silly, so we never do that."
I think so, but am not sure. The spec answer is “who knows” because it’s not defined, but practically I’m not 100% sure if the compiler does it always today.
No, it never does that optimization:

https://play.rust-lang.org/?version=stable&mode=release&edit...

(see assembly output)
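If you want to check this yourself, something along these lines should do. To be clear, this is not the code behind the playground link above, just a sketch with invented function names. Compile in release mode and compare the assembly for the two functions (for example with `rustc -O --emit asm`).

```rust
// Hypothetical pair of functions for comparing generated code.
// `#[inline(never)]` keeps the calls from being inlined away, so the
// argument passing actually shows up in the assembly.
#[inline(never)]
pub fn is_small_by_ref(n: &u32) -> bool {
    *n < 10
}

#[inline(never)]
pub fn is_small_by_value(n: u32) -> bool {
    n < 10
}

fn main() {
    let n = 5;
    println!("{} {}", is_small_by_ref(&n), is_small_by_value(n));
}
```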

Edit 2: I lied: Rust doesn't perform such an optimization but LLVM does, -argpromote. But it only works if the callee is compiled in the same LLVM module as the caller (or with non-thin LTO), and is not visible outside that module. And since it has to respect pointer identity, it only works in a subset of cases.

Original post:

It arguably cannot, because you can cast the reference to a raw pointer and compare it to other pointers, though I don't think there's been a proper discussion on whether or not references are guaranteed to preserve pointer identity.
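A small sketch of what "comparing pointer identity" looks like in safe code. The helper `same_object` is made up for illustration, and `std::ptr::eq` is the standard library function it relies on. A callee like this would notice if the compiler silently passed a copy instead of the original address.

```rust
// Hypothetical helper: true only when both references point at the
// same address, regardless of the values stored there.
fn same_object(a: &i32, b: &i32) -> bool {
    std::ptr::eq(a, b)
}

fn main() {
    let values = [5, 5];

    // Equal values, but two distinct array elements.
    println!("{}", same_object(&values[0], &values[1]));
    // The very same element.
    println!("{}", same_object(&values[0], &values[0]));

    // The same comparison written with explicit raw-pointer casts.
    let p = &values[0] as *const i32;
    let q = &values[1] as *const i32;
    println!("{}", p == q);
}
```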

Edit 1: However, Rust's compilation model does theoretically allow the compiler to modify a function's ABI based on its implementation, at least in some cases, so it could theoretically perform the optimization only when calling functions which it knows don't care about pointer identity. That would avoid violating the aforementioned guarantee that may or may not exist, but it would be less reliable, as the compiler's analysis would inevitably lose track of some pointer values and treat them as escaping, thus potentially identity-sensitive, when they're actually not.

It's because the std::ops::RangeInclusive struct is meant to be much more generic than just ranges over machine-sized integers. I think a generic range struct in C++ would similarly take references.
Wouldn't it be possible to allow passing by value (i.e. moving) if a reference is expected? If I don't need my value afterwards or the type is `Copy`, it shouldn't be a problem, right? Of course, the function would still receive a reference, only at the call site it wouldn't be visible. A clear gain for ergonomics.
I wonder if the API could be expanded to accept any type that implements `Borrow<T>`, like `HashMap::get`. Then both `5` and `&5` could work.
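A rough sketch of what that could look like as a free function. `contains_flexible` is a made-up name and not part of the standard library; it only assumes the existing `Borrow` impls for `T` and `&T`.

```rust
use std::borrow::Borrow;
use std::ops::RangeInclusive;

// Hypothetical wrapper: accepts the item either by value or by
// reference, as long as it can be borrowed as a `T`.
fn contains_flexible<T, Q>(range: &RangeInclusive<T>, item: Q) -> bool
where
    T: PartialOrd,
    Q: Borrow<T>,
{
    // Spell out which Borrow impl is meant to avoid any ambiguity.
    let borrowed: &T = <Q as Borrow<T>>::borrow(&item);
    range.contains(borrowed)
}

fn main() {
    let range: RangeInclusive<i32> = 0..=10;

    println!("{}", contains_flexible(&range, 5_i32));  // by value
    println!("{}", contains_flexible(&range, &5_i32)); // by reference
    println!("{}", contains_flexible(&range, 11_i32)); // out of range
}
```

One trade-off with this shape: passing a non-`Copy` value by value moves it into the call, so the caller gives it up just to ask a yes/no question.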