by lanstin 1312 days ago
That seems like a candidate for a compiler optimization. The compiler should already know that the rule value is used only once, as the operand of &, and this must be fairly common when managing return values.
4 comments

The semantics change. You're now returning a pointer to the actual Rule in the Ruleset, while before you'd be returning a pointer to a copy of the Rule.

The optimization would only work if you had a way to tell the compiler that some values are constant/immutable.

BTW, I've been using both Go and Rust lately.

In Rust you can write a function that returns the pointer of one element of a slice. You can also write a function that returns the pointer to a heap-allocated copy of an element of the slice. The two functions would have different signatures.

The compiler would also prevent mutation of the slice as long as there are any references to individual elements of the slice being passed around.

>In Rust you can write a function that returns the pointer of one element of a slice.

Have fun fighting the borrow checker on that one.

Now try to actually do that in a larger program without static lifetimes...
It shouldn't be too bad as long as you keep in mind that it is a reference to an item in that slice, so whatever that slice is pointing to needs to stick around as long as you're using elements from it. I don't often encounter borrow checker issues anymore, because once you program Rust long enough you know what things need to live for what lifetimes, and you architect your "larger programs" around that.
Yes and that's the whole point!

The borrow checker ensures that you either do it right or you find another way that is safe.

For example, it may be totally fine to heap allocate a result; not all programs need to be optimized to the bone. Just return a boxed value. If you need to share it, wrap it in an Rc or Arc.

One problem I see with it is that Rust chooses to make the most efficient way of programming also the most "simple looking", and so it lines up incentives in such a way that people will try to avoid using an extra Arc just because it looks like unnecessary clutter.

When you learn to embrace your boxing you'll learn that the borrow checker is not your enemy.

Oh yeah. Too bad. Just ran the escape-to-heap analysis on my current project, and it's not looking too promising. Mostly it is allocating structures and saving them in a huge in-memory hash.
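For anyone who wants to try the same analysis, here is a minimal sketch of the kind of code it reports on. The Rule type and the function names are made up for illustration. Building with `go build -gcflags=-m` prints the compiler's escape decisions; for a function like firstCopy it typically reports something like "moved to heap: r", though the exact wording and the effect of inlining vary between Go versions.

```go
package main

import "fmt"

// Rule is a hypothetical, deliberately large struct.
type Rule struct {
	Name    string
	Payload [64]int64
}

// firstCopy returns a pointer to a copy of rs[0]. The copy has to outlive
// the call, so the compiler is expected to allocate it on the heap.
func firstCopy(rs []Rule) *Rule {
	r := rs[0]
	return &r
}

// sumLocal also copies rs[0], but the copy never leaves the function,
// so it can stay on the stack.
func sumLocal(rs []Rule) int64 {
	r := rs[0]
	return r.Payload[0] + r.Payload[1]
}

func main() {
	rs := []Rule{{Name: "allow"}, {Name: "deny"}}

	// Storing pointers in a long-lived map is a legitimate reason to escape:
	// the value really does need to live on the heap.
	cache := map[string]*Rule{}
	p := firstCopy(rs)
	cache[p.Name] = p

	fmt.Println(len(cache), sumLocal(rs))
}
```

If most of the reported escapes look like the cache case, there is not much to win, because those values have to be heap allocated anyway.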
I think the optimization is only valid if we know that nothing is ever going to use the returned pointer to do mutation.
I don't think it can be optimized without altering semantics. If it's a pointer to a value in the slice, changing the slice's values (ruleset[i] = ...) will be reflected in all Rule values returned from the function, because they all point to the same memory. In the same way, changing the returned value's fields will change the data in the original slice. The author's code is prone to this behavior after the change.

When it's a pointer to a copy, no such implicit dependencies occur.
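A small sketch of the difference, assuming a Rule struct and a Ruleset slice type roughly like the ones discussed; the field names and the MatchCopy/MatchRef functions are hypothetical, not the author's code.

```go
package main

import "fmt"

// Rule and Ruleset are hypothetical stand-ins for the types in the thread.
type Rule struct {
	Name     string
	Priority int
}

type Ruleset []Rule

// MatchCopy returns a pointer to a copy of the matching element.
func MatchCopy(rs Ruleset, name string) *Rule {
	for _, rule := range rs {
		if rule.Name == name {
			return &rule // rule is a copy of the slice element
		}
	}
	return nil
}

// MatchRef returns a pointer into the slice's backing array.
func MatchRef(rs Ruleset, name string) *Rule {
	for i := range rs {
		if rs[i].Name == name {
			return &rs[i]
		}
	}
	return nil
}

func main() {
	rs := Ruleset{{"allow", 1}, {"deny", 2}}

	c := MatchCopy(rs, "deny")
	r := MatchRef(rs, "deny")

	rs[1].Priority = 99                 // the owner mutates the slice
	fmt.Println(c.Priority, r.Priority) // prints "2 99"

	r.Priority = 7              // a caller mutates through the returned pointer
	fmt.Println(rs[1].Priority) // prints "7"

	c.Priority = 5              // mutating the copy leaves the slice alone
	fmt.Println(rs[1].Priority) // prints "7"
}
```

Both functions have the same signature, so nothing at the call site tells you which behavior you are getting.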

You're not wrong in general, but one interesting thing about Go as an ecosystem (rather than as a language) is that golang programs are mostly statically compiled — all sources, one pass, one code unit, one object-code output — so they're (theoretically) very amenable to compile-time (rather than link-time) Whole-Program Optimization techniques.

In this specific case, that technique would be whole-program dataflow analysis. Given a Golang function that passes out references-to-copies-of owned data, you could actually determine for certain — at least in the default static-binary linkage mode — whether these two properties hold universally within the resulting binary:

1. whether no caller of the function will ever try to do anything that would cause data within their copy of the struct to be modified;

2. whether the owner of the data will never modify the data of the original struct in such a way that, if the copy were elided, the changes would be "seen" by any reads done in any of the callers. (The owner could still modify internal metadata within the struct for its own use, as long as (a) such internal metadata is in private fields; (b) all callers live outside the package defining the struct, making those fields inaccessible; and (c) the fields are never accessed by any struct methods called by borrowers of the struct, keeping in mind that such methods can be defined outside the package by caller code.)

If you could prove both of these properties (using dataflow analysis), then you could safely elide the copy within the function, turning the return of a reference-to-a-copy-of-X into a return of a reference-to-X.

(And, in fact, if you can only prove the second property universally, and the first property in specific instances, then you can still elide the copy from the function itself; but you'd also generate a wrapper function that calls said function [receiving a reference-to-X], copies, and so returns a reference-to-a-copy-of-X; and then, for any call-site where the first property doesn't hold — i.e. callers whose transitive call-graph will ever modify the data — you'd replace the call to the original function with a call to the wrapper. So "safe" connected caller sub-graphs would receive references, while "unsafe" connected caller sub-graphs would receive copies.)
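Written out by hand, the result of that transformation might look roughly like the sketch below. To be clear, this is not something the Go compiler does today as far as I know; the names (findRef, findCopy, priorityOf, bumped) and the package-level ruleset are all hypothetical, and the sketch simply assumes the second property holds (the owner never mutates ruleset after initialization).

```go
package main

import "fmt"

// Rule is a hypothetical stand-in for the struct being returned.
type Rule struct {
	Name     string
	Priority int
}

// ruleset is owned data that, by assumption, is never modified after init.
var ruleset = []Rule{{"allow", 1}, {"deny", 2}}

// findRef is the original function with the copy elided:
// it returns a reference-to-X.
func findRef(name string) *Rule {
	for i := range ruleset {
		if ruleset[i].Name == name {
			return &ruleset[i]
		}
	}
	return nil
}

// findCopy is the generated wrapper: it calls findRef, copies,
// and returns a reference-to-a-copy-of-X.
func findCopy(name string) *Rule {
	p := findRef(name)
	if p == nil {
		return nil
	}
	c := *p
	return &c
}

// priorityOf only reads through the pointer, so it is a "safe" call site
// and can use findRef directly.
func priorityOf(name string) int {
	if p := findRef(name); p != nil {
		return p.Priority
	}
	return -1
}

// bumped writes through the pointer, so it is an "unsafe" call site
// and gets rewritten to call the wrapper.
func bumped(name string) *Rule {
	p := findCopy(name)
	if p != nil {
		p.Priority++
	}
	return p
}

func main() {
	b := bumped("deny")
	fmt.Println(b.Priority, priorityOf("deny")) // prints "3 2"
}
```

The hard part is not generating the wrapper but proving which call sites belong in which group, since a pointer handed to a read-only caller can still flow onward to code that mutates it.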

Nah, an optimization is dangerous, as others have said. A lint that detects oversized copies could be worthwhile though.
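A rough sketch of what such a lint could look like, using the standard go/parser and go/types packages to flag range loops whose value variable is larger than some threshold. The 128-byte limit, the finding type, the largeRangeCopies name, and the sample source are all assumptions for illustration; a real checker would more likely be built on the go/analysis framework and would also cover assignments and by-value parameters.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// src is a made-up input file for the sketch.
const src = `package p

type Rule struct {
	Name    string
	Payload [64]int64
}

func find(rs []Rule, name string) *Rule {
	for _, r := range rs {
		if r.Name == name {
			return &r
		}
	}
	return nil
}

func total(xs []int) int {
	n := 0
	for _, x := range xs {
		n += x
	}
	return n
}
`

// finding records one oversized per-iteration copy.
type finding struct {
	Pos  token.Position
	Size int64
}

// largeRangeCopies reports range loops whose value variable is
// larger than limit bytes (sizes computed for gc on amd64).
func largeRangeCopies(src string, limit int64) ([]finding, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "p.go", src, 0)
	if err != nil {
		return nil, err
	}
	info := &types.Info{
		Types: map[ast.Expr]types.TypeAndValue{},
		Defs:  map[*ast.Ident]types.Object{},
		Uses:  map[*ast.Ident]types.Object{},
	}
	conf := types.Config{} // the sample has no imports, so no Importer is needed
	if _, err := conf.Check("p", fset, []*ast.File{file}, info); err != nil {
		return nil, err
	}
	sizes := types.SizesFor("gc", "amd64")

	var out []finding
	ast.Inspect(file, func(n ast.Node) bool {
		rs, ok := n.(*ast.RangeStmt)
		if !ok || rs.Value == nil {
			return true
		}
		t := info.TypeOf(rs.Value)
		if t == nil {
			return true
		}
		if sz := sizes.Sizeof(t); sz > limit {
			out = append(out, finding{fset.Position(rs.Pos()), sz})
		}
		return true
	})
	return out, nil
}

func main() {
	found, err := largeRangeCopies(src, 128)
	if err != nil {
		panic(err)
	}
	for _, f := range found {
		fmt.Printf("%s: range copies %d bytes per iteration\n", f.Pos, f.Size)
	}
}
```

On the sample input this flags the loop in find and leaves the int loop in total alone.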