by lyricaljoke 2486 days ago
The borrower semantics is anything but simple.

I keep seeing people saying this when discussing Rust in relation to C++. It's not untrue, but I would counter by saying that people who were writing C++ without having a full understanding of ownership and lifetime semantics were writing buggy code. Rust just makes understanding those semantics required to get your code to compile.

3 comments

I have a somewhat related question -

I took CS 101 ~ 301 in college, and have worked briefly in C++ here and there during my career. I like to think I have a working understanding of pointers and references.

Despite that, whenever I go into a C++ codebase (recently, the Chromium codebase) I get _completely lost_ trying to figure out what the owner and lifetime of an object are, and whether it's appropriate to pass by value, pass by reference, or clone. I have absolutely no clue about the impact of these decisions, aside from a vague sense that pass-by-reference is more memory efficient and cloning is safest. (When do we care? Can you just pick one and forget about it until performance matters?)

How do you start to pick up an intuition for manually working with memory this way? Is there a set of references that can take you past the beginner level? I find the C++ reference documentation completely opaque, and most search hits tend to be too domain specific to be useful.

Is this something that Rust would help with? Does its model make it easier to understand which choice is correct in each case? Is it something that comes with language exposure? Or tooling?

Rust is a huge help with exactly this problem. The compiler will clearly tell you exactly where you've made a mistake if you give it code that violates the ownership rules, like trying to keep a reference past its valid lifetime.

It took me a little bit to get used to, but learning this definitely helped me learn to think about ownership and lifetimes better. I highly recommend you try learning some Rust.
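
For a concrete feel, here is a minimal sketch of the kind of lifetime mistake the compiler rejects, next to two versions it accepts. The function names (`first_word`, `shout`, `shout_dangling`) are made up for illustration and are not from anything above.

```rust
// Rejected at compile time: the returned reference would point at `owned`,
// which is dropped when the function returns.
//
// fn shout_dangling(text: &str) -> &str {
//     let owned = text.to_uppercase();
//     &owned
// }

/// Borrowing version: the returned slice points into `text`,
/// so it is valid for exactly as long as `text` is.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

/// Owning version: returns a fresh `String`, so the caller has
/// no lifetime to track at all.
fn shout(text: &str) -> String {
    text.to_uppercase()
}
```

Uncommenting `shout_dangling` makes the build fail with an error pointing at the `&owned` line, which is the "clearly tells you where" part in practice.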

I have known C++ since Turbo C++, have written C++ code at CERN, and make use of static analysis tooling in C++, and yet it took me a couple of days to deal with Gtk-rs, trying to minimise the use of Rc<RefCell<>>.

While I did persevere to make it work, not everyone is willing to keep trying.

so how did you manage to not keep track of object lifetimes before? serious question, rust just forces you to do what c++ expects you to do, threatening undefined behavior at runtime if you don't. anecdotally i can say about myself that learning rust made me a better python programmer, of all things.

Modern c++ makes lifetimes pretty implicit.

Default assumption: all data lives in a value type, or an STL container. The destructor/copy constructors will automatically deal with lifetimes for you.

Fallback #1: the remainder lives in a std::unique_ptr, or perhaps std::shared_ptr. The destructor/copy constructors will deal with lifetimes for you.

Fallback #2: Take a moment to reflect upon the mistakes that you've made that have led you here. This is your fault. Write a class with correct constructor, destructor, copy constructor, move constructor, and move/copy assignment operators. This is roughly the equivalent of resorting to unsafe in rust in a place that makes calls to other rust code. (as opposed to a syscall or a c function call or somewhere else that it's obviously required) If you're here, it's because you fucked up.

Since c++11 became available, I've eliminated all use of new/delete from new code that I write. I've slowly refactored old code (by myself and others) to do the same. The only significant attention I've paid to lifetimes is when I'm using a c library or am writing c# where half my objects are IDisposable. (yes, really, I do more manual memory management in a garbage collected language)

The problem with c++ isn't that you have to worry about lifetimes, it's that you used to have to worry about object lifetimes and many codebases were written during that time. Sure, we could rewrite the world in rust and all of that would go away, or we could rewrite the world in modern c++ and it would also go away.

I think you missed the point of Rust. It's not about replacing new/delete with RAII (even though that's a good idea for sure). How does modern C++ help you with tracking whether a pointer/reference outlives the lifetime of the pointee? Not at all.

By using the lifetime static analysis recently introduced in clang-tidy and Visual C++ 2019.

It is still in the early stages, and work is being done to make it a common feature across major compilers while improving its capabilities.

RAII, STL data structures, smart pointers, warnings as errors, static analysis as errors, and refusing to write C in C++.

There are many other instances where ownership/lifetimes kick in in ways that would not be a problem in a different language.

Anywhere you have any kind of containment hierarchy, Rust starts becoming more invasive than other languages. In languages like Go, you can have complex graphs of objects referring to each other, and you can (often arguably safely) work with these structures without thinking about who owns what.

As an example of something that got me stuck, I recently had some code that populated a map of mutable buffers:

  struct Builder {
    buffers: BTreeMap<Term, RefCell<PostingBuffer>>,
  }

At the end of the build, it needs to flush the buffers, in key order, to a file and then empty the map so it can be reused. It turns out this is trickier than expected because the map owns its contents, so you can't take ownership of the RefCell that wraps each buffer while you are only borrowing the map:

  for (term, buf) in &self.buffers {
    // into_inner consumes buf, so it moves; fails with "cannot move out of borrowed content"
    let data = buf.into_inner().get_data();
    w.write(term, data)?;
  }
  self.buffers = BTreeMap::new();

In the end, the trick was to replace the map for the iteration:

  for (term, v) in std::mem::replace(&mut self.buffers, BTreeMap::new()) {
    let data = v.into_inner().get_data();
    w.write(term, data)?;
  }

Initially I used a HashMap to optimize for build speed, then sorted its keys at the end. This was also a challenge, because maps apparently have no way of getting a copy of the keys by value. (There might be an easier way, but again, this just illustrates the learning curve for new developers.) In the end, I chose BTreeMap so the keys are already sorted.
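
On the HashMap point: as far as I know there are two common ways to get keys out by value, sketched below. `String` keys and `Vec<u8>` values stand in for `Term` and the buffers, and both helper names are made up for this example.

```rust
use std::collections::HashMap;

/// Returns an owned, sorted copy of the keys. The map is only borrowed,
/// so it stays usable afterwards; the cost is one clone per key.
fn sorted_keys(map: &HashMap<String, Vec<u8>>) -> Vec<String> {
    let mut keys: Vec<String> = map.keys().cloned().collect();
    keys.sort();
    keys
}

/// Consumes the map and returns its entries sorted by key.
/// Nothing is cloned, because the map gives up ownership of its contents.
fn into_sorted_entries(map: HashMap<String, Vec<u8>>) -> Vec<(String, Vec<u8>)> {
    let mut entries: Vec<(String, Vec<u8>)> = map.into_iter().collect();
    entries.sort_by(|a, b| a.0.cmp(&b.0));
    entries
}
```

The first form needs keys that implement `Clone`; the second needs ownership of the whole map, which is the same constraint that shows up in the flush loop.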

This is all probably entirely obvious to Rust experts, but not so to new developers. Swapping out the entire map with std::mem::replace() would never have occurred to me. Even if you understand borrowing in principle, you have to think about what a container means in terms of borrowing, and how stuff will move in and out of a container. And you have to design the way you interact with them accordingly.
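
For anyone who wants to poke at the pattern in isolation, here is a self-contained sketch. It assumes a simplified `PostingBuffer` that just wraps a `Vec<u8>` and uses `String` in place of `Term`; `add` and `flush` are hypothetical names. It also uses `std::mem::take` (stable since Rust 1.40), which is shorthand for `std::mem::replace` with the type's `Default` value.

```rust
use std::cell::RefCell;
use std::collections::BTreeMap;

// Simplified stand-in for the real buffer type (an assumption for this sketch).
struct PostingBuffer {
    data: Vec<u8>,
}

impl PostingBuffer {
    // Takes `self` by value, so calling it consumes the buffer.
    fn get_data(self) -> Vec<u8> {
        self.data
    }
}

#[derive(Default)]
struct Builder {
    buffers: BTreeMap<String, RefCell<PostingBuffer>>,
}

impl Builder {
    fn add(&mut self, term: &str, byte: u8) {
        self.buffers
            .entry(term.to_string())
            .or_insert_with(|| RefCell::new(PostingBuffer { data: Vec::new() }))
            .get_mut() // we hold `&mut`, so no runtime borrow check is needed
            .data
            .push(byte);
    }

    // Drains the buffers in key order and leaves an empty map behind,
    // so the builder can be reused.
    fn flush(&mut self) -> Vec<(String, Vec<u8>)> {
        let mut out = Vec::new();
        // `take` swaps an empty map into `self.buffers` and hands back the
        // old one by value, so the loop owns every entry it visits.
        for (term, buf) in std::mem::take(&mut self.buffers) {
            out.push((term, buf.into_inner().get_data()));
        }
        out
    }
}
```

The key point is that the loop iterates over a map it owns, not over a field that is still reachable through `&mut self`.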

In principle, I think this awareness is a good thing, and that the resulting code will be smarter and more efficient, but at the same time, it doesn't make for such an ergonomic developer experience; so far it's been a drain on my productivity more than a gain. I like to say that Rust "scales down" towards the low levels, but does less well "scaling up"; you can't write high-level code without also thinking about the low levels, when even things like the size of your data type are always in your face.

My solution above may not even be the most optimal, idiomatic way of doing things. I'm sure someone will come along and point out that there's a trick involving using something other than RefCell or whatever that simplifies everything.

Not necessarily a criticism of Rust, just a data point in terms of complexity and learning curve.

Fully acknowledge your final paragraphs here, for sure. But in case you're curious...

I think that if you wrote

  for (term, buff) in self.buffers {

it should have Just Worked. By iterating over a reference to self.buffers, you iterate over a reference to the contents, which you can't move out of. But iterating over it by value, it should give it to you by value, and you'd be fine. That said, I have not tried it, so there might be some context I'm missing. One nice thing about a strict compiler is that I have to think about the details less, and use the error messages to help me fix any problems I find. But that makes it harder to know how it goes without the compiler...

No, I tried that! That actually moves the "Cannot move" error to the "self.buffers" use itself.

The reason I can't just iterate by value is that I'd be consuming data that is already held by the map. The buffer is defined thus:

  use std::io;

  use bitstream_io::{BigEndian, BitWriter};

  struct PostingBuffer<W: io::Write> {
    bw: BitWriter<W, BigEndian>,
    // ...
  }

  impl<W: io::Write> PostingBuffer<W> {
    // Takes self by value, so calling it consumes the buffer.
    pub fn get_data(self) -> W {
      self.bw.into_writer()
    }
  }

bitstream_io::BitWriter's [1] into_writer consumes self:

  pub fn into_writer(self) -> W {
    self.writer
  }
[1] https://docs.rs/bitstream-io/0.8.2/bitstream_io/write/struct...
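
To make the by-value point concrete, here is a rough sketch of when plain by-value iteration does compile: when the method itself takes `self` by value. `Buf` is a made-up stand-in with a consuming `into_writer`, not the bitstream_io type, and `finish` is a hypothetical name.

```rust
use std::collections::BTreeMap;

// Stand-in for a type whose accessor consumes it (an assumption for this sketch).
struct Buf {
    writer: Vec<u8>,
}

impl Buf {
    fn into_writer(self) -> Vec<u8> {
        self.writer
    }
}

struct Builder {
    buffers: BTreeMap<String, Buf>,
}

impl Builder {
    // With a `&mut self` receiver, `for (t, b) in self.buffers` is rejected:
    // it would move the map out from behind a mutable reference.
    //
    // With a `self` receiver the method owns the whole builder, so moving
    // the map (and each buffer) out of it is allowed.
    fn finish(self) -> Vec<(String, Vec<u8>)> {
        self.buffers
            .into_iter()
            .map(|(term, buf)| (term, buf.into_writer()))
            .collect()
    }
}
```

The trade-off is that the builder is gone after `finish`, so this only fits when reuse is not needed; for a reusable builder, swapping the map out is still the approach that applies.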

Ah, ah, I see now. Yes, right.

This is exactly why I couched my last paragraph the way I did, heh.