by Arnavion 294 days ago
Rust's behavior of moving without leaving a moved-out shell behind also simplifies the implementation of the type itself, because its dtor doesn't have to handle the special case of a moved-out shell, and the type doesn't even need to be able to represent a moved-out shell.

For example, a moved-out-from tree in C++ could represent this by having its inner root pointer be nullptr, and then its dtor would have to check for the root being nullptr, and all its member fns would have the danger of UB (nullptr dereference) if the caller called them on a moved-out shell. But the Rust version could use a non-nullable pointer type (Box), and its dtor and member fns would be guaranteed to act on a valid pointer.
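A minimal sketch of the C++ side of that comparison, assuming a hypothetical Tree class (the names Node, Tree, is_moved_from and root_value are made up for illustration). The root is held in a nullable std::unique_ptr, so the type can represent the moved-out shell, and every accessor has to account for it:

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical node type, for illustration only.
struct Node {
    explicit Node(int v) : value(v) {}
    int value;
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
};

class Tree {
public:
    explicit Tree(int root_value) : root_(std::make_unique<Node>(root_value)) {}

    // Defaulted moves: the source's root_ is left as nullptr, which is
    // the "moved-out shell" the comment describes.
    Tree(Tree&&) noexcept = default;
    Tree& operator=(Tree&&) noexcept = default;

    // The type has to be able to report (and represent) the shell state.
    bool is_moved_from() const { return root_ == nullptr; }

    // Without this check, calling root_value() on a shell would be a
    // null dereference (undefined behavior).
    int root_value() const {
        assert(root_ != nullptr && "root_value() called on a moved-out Tree");
        return root_->value;
    }

private:
    std::unique_ptr<Node> root_;  // nullable, unlike Rust's Box
};
```

Here the implicit destructor is safe only because std::unique_ptr itself already handles the null case; with a raw pointer the class would have to do that by hand.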

3 comments

This was one of the most unsatisfying things about learning C++ move semantics. They only kinda move the thing, leaving this shell behind is a nightmare.
C++ doesn't have ownership baked into the language like Rust does, and "move semantics" is all about ownership (under the hood it's just a plain old shallow copy both in C++ and Rust). Making the moved from object inaccessible like in Rust would have required static ownership tracking which I guess the C++ committee was afraid to commit to (and once you have that, you're basically halfway to Rust, including the downside of a more restrictive programming model).
> Making the moved from object inaccessible like in Rust would have required static ownership tracking which I guess the C++ committee was afraid to commit to (...)

I'm not sure the "afraid to commit to" is a valid interpretation. The requirements that the C++ standard specifies for moved-from objects turn that hypothetical issue into a non-problem. In C++, if you move an object then after the move the object must be left in a valid state. That's it. This means the object can be safely destroyed.

You are also free to implement whatever semantics your moved-from object has. If you want your moved-from object to throw an exception, you are free to implement that. If instead you want to ensure your moved-from object can be reused, you are also free to do so. If you want to support zombie objects then nothing prevents you from going that path. It's up to you. The only thing the standard specifies is that once the lifetime of that object ends, it can be safely destroyed. That sounds both obvious and elegant, don't you agree?

You'd have to mark some functions as deleting their arguments, but I wouldn't really call that ownership. And it shouldn't restrict the language: if the compiler can't resolve it statically, it can set a flag (or null the object) and check that before calling the destructor, instead of a guard being built into every destructor.
> This was one of the most unsatisfying things about learning C++ move semantics. They only kinda move the thing, leaving this shell behind is a nightmare.

I don't know what nightmares you have. The only requirement that C++ specifies for moved-from objects is that they remain valid. Meaning, they can be safely destroyed.

You can go way out of your way and reuse an object that was just moved, but that's a decision you somehow made, and you have the responsibility of adding your reinitialization or even move logic to get that object back in shape. That is hardly something that sneaks up on you.
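As a small illustration of that "reinitialize before reuse" pattern, here is a sketch using standard library types. The helper name take_and_reset is hypothetical. It relies only on the guarantee that a moved-from std::vector is in a valid but unspecified state, so operations without preconditions, such as clear(), are safe to call on it:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Moves the contents out of `src`, then puts `src` back into a known state.
std::vector<std::string> take_and_reset(std::vector<std::string>& src) {
    std::vector<std::string> taken = std::move(src);
    // `src` is now valid but unspecified. clear() has no preconditions,
    // so it is safe here and leaves `src` empty.
    src.clear();
    assert(src.empty());
    return taken;
}
```

After the call, the caller can keep using the original vector (for example with push_back) because it has been explicitly reset rather than assumed to be empty.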

Since I use move semantics all the time, this is for me the most frustrating thing about C++ full stop. I really wish they'd fix this instead of adding all those compile-time features.
> Since I use move semantics all the time (...)

Everyone who ever uses C++ uses move semantics all the time, including move elision. It's not an obscure feature.

> (...) this is for me the most frustrating thing about C++ full stop.

I've been using C++ for years and I have no idea what you could possibly be referring to. The hardest aspect of move semantics is basically the rule of 5. From that point, when you write a class you have the responsibility to specify how you want your class to be moved and what you want your moved-from object to look like, provided that you leave it in a valid state.

That's it.

What exactly do you believe needs fixing?

How would you fix this in C++?
By adding syntax and semantics for destructible moves, meaning the moved object is removed from its scope (without calling its destructor.)
I've worked with C++ for a number of years, with a few codebases that were >1M LoC. Never did I stumble upon a situation where an object was moved and an existing symbol became a problem. I wonder what you are doing to get yourself in that situation.
> I wonder what you are doing to get yourself in that situation.

The problem with the current move semantics is that, compared to e.g. Rust: 1) the compiler generates unnecessary code and 2) instead of just implementing class T you must implement a kind of optional<T>.

Which means that after all those years of using smart pointers I find myself ditching them in favor of plain pointers like we did in the 90's.

When I looked into the history of the C++ move (which after all didn't even exist in C++ 98 when the language was first standardized) I discovered that in fact they knew nobody wants this semantic. The proposal paper doesn't even try to hide that what programmers want is the destructive move (the thing Rust has) but it argues that was too hard to do with the existing C++ design so...

The more unfortunate, perhaps disingenuous part is that the proposal paper tries to pretend you can make the destructive move later if you need it once you've got their C++ move.

But actually what they're proposing is that "move + create" + "destroy" = "move". That's extra work, so it's not the same thing at all, and sure enough, in the real world this means extra work from compilers, from programmers, and sometimes (if it isn't removed by the optimiser) from the running program.
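To make the "move + create" + "destroy" accounting concrete, here is a sketch with a hypothetical Buffer class and two global counters (all names are invented for illustration). The move constructor transfers the pointer and writes an empty state into the source, and the source's destructor still runs later:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

int destructor_runs = 0;  // how many Buffer destructors executed
int frees = 0;            // how many of them actually owned memory

// Hypothetical owning buffer, for illustration only.
class Buffer {
public:
    explicit Buffer(std::size_t n) : data_(new char[n]) {}

    // "move": take the pointer. "create": write a valid empty state
    // (nullptr) into the source so its destructor can still run.
    Buffer(Buffer&& other) noexcept
        : data_(std::exchange(other.data_, nullptr)) {}

    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;
    Buffer& operator=(Buffer&&) = delete;

    // "destroy": runs for the moved-from source as well.
    ~Buffer() {
        ++destructor_runs;
        if (data_ != nullptr) { ++frees; }
        delete[] data_;
    }

    bool empty() const { return data_ == nullptr; }

private:
    char* data_;
};

void demo() {
    {
        Buffer a(16);
        Buffer b(std::move(a));
        assert(a.empty());
    }
    // Two destructors ran, but only one had anything to free.
    assert(destructor_runs == 2);
    assert(frees == 1);
}
```

With a destructive move, the store of nullptr and the second destructor call would not exist in the first place. In this sketch an optimiser may remove them, but that depends on what it can see.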

C++ is riddled with “good enough” without completeness, resulting in more band-aids added to the language to fix things that were only half implemented in the first place.
> When I looked into the history of the C++ move (which after all didn't even exist in C++ 98 when the language was first standardized) I discovered that in fact they knew nobody wants this semantic. The proposal paper doesn't even try to hide that what programmers want is the destructive move (the thing Rust has) but it argues that was too hard to do with the existing C++ design so...

> The more unfortunate, perhaps disingenuous part is that the proposal paper tries to pretend you can make the destructive move later if you need it once you've got their C++ move.

For reference, I think N1377 is the original move proposal [0]. Quoting from that:

> Alternative move designs

> Destructive move semantics

> There is significant desire among C++ programmers for what we call destructive move semantics. This is similar to that outlined above, but the source object is left destructed instead of in a valid constructed state. The biggest advantage of a destructive move constructor is that one can program such an operation for a class that does not have a valid resourceless state. For example, the simple string class that always holds at least a one character buffer could have a destructive move constructor. One simply transfers the pointer to the data buffer to the new object and declares the source destructed. This has an initial appeal both in simplicity and efficiency. The simplicity appeal is short lived however.

> When dealing with class hierarchies, destructive move semantics becomes problematic. If you move the base first, then the source has a constructed derived part and a destructed base part. If you move the derived part first then the target has a constructed derived part and a not-yet-constructed base part. Neither option seems viable. Several solutions to this dilemma have been explored.

<snip>

> In the end, we simply gave up on this as too much pain for not enough gain. However the current proposal does not prohibit destructive move semantics in the future. It could be done in addition to the non-destructive move semantics outlined in this proposal should someone wish to carry that torch.

[0]: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2002/n13...

Now that would be a cool first proposal and implementation. I wonder if there’s any prior art in C++ yet.
If there is any prior art I'm not aware of it. The problems described in the part I snipped out around how destructive moves would work with class hierarchies sound thorny, for what it's worth.
Destructive vs non-destructive move.
> For example, a moved-out-from tree in C++ could represent this by having its inner root pointer be nullptr, and then its dtor would have to check for the root being nullptr,

delete null is fine in C++ [1], so, assuming root either is a C++ object or a C type without members that point to data that also must be freed, its destructor can do delete root. And those assumptions would hold in ‘normal’ C++ code.

[1] https://en.cppreference.com/w/cpp/language/delete.html: “If ptr is a null pointer value, no destructors are called, and the deallocation function may or may not be called (it's unspecified), but the default deallocation functions are guaranteed to do nothing when passed a null pointer.”
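A sketch of what that looks like with raw pointers, again using a hypothetical Tree and Node (names and the nodes_destroyed counter are invented for illustration). The destructor calls delete unconditionally, and that is well defined for the moved-from shell because delete on a null pointer does nothing:

```cpp
#include <cassert>
#include <utility>

int nodes_destroyed = 0;  // counts Node destructor calls

// Hypothetical node type, for illustration only.
struct Node {
    explicit Node(int v) : value(v) {}
    Node(const Node&) = delete;
    Node& operator=(const Node&) = delete;
    // No null checks: deleting a null child is a no-op.
    ~Node() { ++nodes_destroyed; delete left; delete right; }

    int value;
    Node* left = nullptr;
    Node* right = nullptr;
};

class Tree {
public:
    explicit Tree(int root_value) : root_(new Node(root_value)) {}
    Tree(Tree&& other) noexcept
        : root_(std::exchange(other.root_, nullptr)) {}
    Tree(const Tree&) = delete;
    Tree& operator=(const Tree&) = delete;
    Tree& operator=(Tree&&) = delete;

    // Works for both a live tree and a moved-from shell.
    ~Tree() { delete root_; }

    const Node* root() const { return root_; }

private:
    Node* root_;
};
```

This covers the destructor only. Member functions that dereference root_ still need to be kept away from the shell.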

In practice, move operations typically just leave an empty object behind. The destructor already has to deal with that. And of course you can't call certain methods on an empty object. So in practice you don't need special logic except for the move operations themselves.
> The destructor already has to deal with that.

That's partly true, partly circular. Because moves work this way, it's harder to make a class that doesn't have empty states, so I don't design my class to avoid empty states, so the destructor has to handle them.

> That's partly true, partly circular.

I don't think there is anything "partly" about it being true. A moved-from object is expected to remain valid and preserve class invariants. If you wrote a class whose objects fail to remain valid after being moved, you wrote bugs into your code.

> Because moves work this way, it's harder to make a class that doesn't have empty states, so I don't design my class to avoid empty states, so the destructor has to handle them.

You are not required to implement an empty state. You are only required to write your classes so that an object remains valid after being moved from. You are free to specify what this means for your classes: it can be anything from leaving the object as if it were default-initialized to literally having a member variable such as bool moved. It's up to you. From C++'s perspective, as long as your moved-from object can be safely destroyed, then it's all good. Anything else is behavior you chose to have, and bugs you introduced.

It's not like it's the only part of the language that mandates a default constructor though. There are plenty of situations where default-constructible types are desirable. Even simple things like having a non-default-constructible type in a map is awkward.
> It's not like it's the only part of the language that mandates a default constructor though

It’s… not a part of the language which mandates a default ctor in the first place.

> It’s… not a part of the language which mandates a default ctor in the first place.

Why should it, though? Think about it. The goal of move semantics is performance, mainly avoiding copying or initializing expensive objects, using a standard syntax. Why do you believe it would be a good idea to force constructors when they can very well be the reason why move should be used?

Did you reply to the wrong comment?
It doesn't, but it does mandate that the object has some "empty state". If you have an empty state you might as well have a default constructor which initializes the object to that empty state.
Moved-from objects are not in an empty state but in an unspecified state. They are only required to be destructible; every other operation can be disallowed. That is not a useful state for default construction. Thus being movable does not imply that default construction is any sort of good idea.

The other way around makes more sense, but even then it is not systematic, if default construction is costly (allocation, syscall, …) then you don’t want to do that for a moved-from object which will just be destroyed, which is the fate of most.

Please give me an example for a class that needs to handle empty state in the destructor only because of move operations. These exist, but IME they are very rare. As soon as you have a default constructor, the destructor needs to handle the case of empty state.
It’s not just the destructor you have to worry about, it’s all of the state accessible to callers.

If you have any type that represents validated data, say a string wrapper which conveys (say) a valid customer address, how do you empty it out?

You could turn it into an empty string, but now that .street() method has to return an optional value, which defeats the purpose of your type representing validated data in the first place.

The moved-from value has to be valid after move (all of its invariants need to hold), which means you can’t express invariants unless they can survive a move.

It is much better for the language to simply zap the moved-from value out of existence so that you don’t have to deal with any of that.

First, one shouldn't use a moved-from object in the first place (except for, maybe, reassigning it).

Second, why can't the .street() method simply return an empty string in this case?

> The moved-from value has to be valid after move (all of its invariants need to hold)

The full quote from the C++ standard is: "Unless otherwise specified, such moved-from objects shall be placed in a valid but unspecified state." AFAIK, it only makes such requirements for standard library types, but not for user-defined types. Please correct me if I'm wrong.

> First, one shouldn't use a moved-from object in the first place (except for, maybe, reassigning it).

It still requires you to come up with something to do to the old value in the move constructor. What would you do in the ValidatedAddress case? Set a flag in the struct called “moved_from” and use that to throw an exception if it’s ever used? Wouldn’t it be nice if you just didn’t need to worry about it?

> Second, why can't the .street() method simply return an empty string in this case?

In this example I’m referring to a type that represents a “validated” address, so, one that has already passed checks to make sure the street isn’t empty, etc. (it’s the whole “parse, don’t validate” idea, although I’ve never understood why the word “parse” is used when I would’ve just called it “validate just once”.)

It is an extremely useful concept for your type system to represent invariants in your data like this. Having to make every type contain an “empty” case, just to make the language’s move semantics work, pokes an enormous hole through this idea.

> AFAIK, it only makes such requirements for standard library types, but not for user defined types

It makes the requirement because the compiler is not going to stop anyone from using the moved-from value, so you have to think of something to do in the move constructor. You can pinky-swear to never use the moved-from value in your own code (and linters can help here) but the possibility still exists, so it must be solved for.
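A sketch of the “moved_from” flag approach mentioned above, to show what it costs. Everything here is hypothetical: the parse() factory, the single street field, and the trivial "non-empty" validation stand in for whatever real checks such a type would perform.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

class ValidatedAddress {
public:
    // Hypothetical validation: only checks that the street is non-empty.
    static std::optional<ValidatedAddress> parse(std::string street) {
        if (street.empty()) { return std::nullopt; }
        return ValidatedAddress(std::move(street));
    }

    // The move constructor has to leave *something* behind, so it marks
    // the source as moved-from.
    ValidatedAddress(ValidatedAddress&& other) noexcept
        : street_(std::move(other.street_)),
          moved_from_(other.moved_from_) {
        other.moved_from_ = true;
    }

    // Every accessor now carries a runtime check that exists only
    // because the moved-from object is still reachable.
    const std::string& street() const {
        if (moved_from_) {
            throw std::logic_error("use of moved-from ValidatedAddress");
        }
        return street_;
    }

private:
    explicit ValidatedAddress(std::string street)
        : street_(std::move(street)) {}

    std::string street_;
    bool moved_from_ = false;
};
```

The invariant "street is never empty" holds for every object a caller can legitimately observe, but only because the extra flag turns misuse into an exception at run time instead of a compile error.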

This means C++ is riddled with types that have unrelated "I'm empty" state inside them rather than this being relegated to a separate wrapper type. It's Tony's Billion Dollar Mistake but smeared across an entire ecosystem.

The smart pointer std::unique_ptr<T> is an example of this. Sometimes people will say it's basically a boxed T, so analogous to Rust's Box<T>, but it isn't quite: it's actually equivalent to Option<Box<T>>. And if we don't want to allow None? Too bad, you can't express that in C++.
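A short sketch of the "None" state in question. The helper value_or is hypothetical. The standard guarantees that a std::unique_ptr compares equal to nullptr after being moved from, so any code that receives one has to be prepared for the empty case:

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Reads through the pointer, falling back when it is empty. The null
// check plays the role that matching on None would play for an
// Option<Box<int>> in Rust.
int value_or(const std::unique_ptr<int>& p, int fallback) {
    return p ? *p : fallback;
}

void demo() {
    std::unique_ptr<int> a = std::make_unique<int>(7);
    std::unique_ptr<int> b = std::move(a);

    // After the move, `a` is still a usable object, but it is empty.
    assert(a == nullptr);
    assert(value_or(a, -1) == -1);
    assert(value_or(b, -1) == 7);
}
```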

But you're right that C++ people soldier on, there aren't many C++ types where this nonsense unavoidably gets in your face. std::variant's magic valueless_by_exception is such an example and it's not at all uncommon for C++ people to just pretend it can't happen rather than take it square on.

> This means C++ is riddled with types that have unrelated "I'm empty" state

Again, these cases are still rare. Most classes either don't require user-defined move operations, or they have some notion of emptiness or default state.

> And if we don't want to allow None? Too bad, you can't express that in C++

That's actually a good example! Nitpick: you can express it in C++, just not without additional logic and some overhead :)

>you can express it in C++, just not without additional logic and some overhead :)

How?

(And that difference leads to an ABI difference that makes it not a zero overhead abstraction in the way that Box is…)
Great point! Chandler Carruth explained this in one of his CppCon talks: https://youtu.be/rHIkrotSwcc?t=1047
A socket.
How so? Doesn't your socket class have a default constructor and a notion of open and closed?
If the moves were destructive, I'd design it to have the default constructor call `::socket` and destructor call `::close`. And there wouldn't be any kind of "closed" state. Why would I want it?
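For contrast, a sketch of what such a socket wrapper tends to look like with today's non-destructive moves, assuming a POSIX system. The class name, the choice of AF_INET/SOCK_STREAM, and the -1 sentinel are illustrative assumptions. The default constructor calls ::socket and the destructor calls ::close, as described, but a "closed" state (fd_ == -1) still has to exist purely so that a moved-from Socket can be destroyed:

```cpp
#include <cassert>
#include <cerrno>
#include <system_error>
#include <utility>

#include <sys/socket.h>
#include <unistd.h>

class Socket {
public:
    // Opens a socket, or throws if the OS refuses.
    Socket() : fd_(::socket(AF_INET, SOCK_STREAM, 0)) {
        if (fd_ < 0) {
            throw std::system_error(errno, std::generic_category(), "socket");
        }
    }

    // The source must be left in some destructible state, so a sentinel
    // "closed" value is introduced just for this purpose.
    Socket(Socket&& other) noexcept : fd_(std::exchange(other.fd_, -1)) {}

    Socket(const Socket&) = delete;
    Socket& operator=(const Socket&) = delete;
    Socket& operator=(Socket&&) = delete;

    // The guard exists only because of the moved-from state.
    ~Socket() {
        if (fd_ >= 0) { ::close(fd_); }
    }

    bool is_open() const { return fd_ >= 0; }

    int fd() const {
        assert(is_open() && "fd() called on a moved-from Socket");
        return fd_;
    }

private:
    int fd_;
};
```

With destructive moves, the sentinel, the guard in the destructor, and is_open() could all be dropped, since no closed Socket would ever be observable.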