Hacker News new | ask | show | jobs
by wallstop 3997 days ago
This post horrified me. This may be due to two factors: (1) having not written any serious C++ in the past year and (2) a focus on "correct" or "easily verifiable" code.

I can understand the urge to want to have the speed and power of move, but it feels like you're destructively working around the standard. The proposed API in SetStringer seems very dangerous; you're passing back a const ref to a string that isn't really const at all (successive calls after the set has been updated will mutate it's state, users may be unaware of this).

Move, copy, and (const) references each have their place and should be used accordingly. Move when you want a transfer of ownership, copy when you want duplication (at this instance) of ownership to another owner, ref when you want shared ownership. The intent of the article seems to actively go against these intents; stinos' comment sums up the rest of my feelings perfectly.

3 comments

> The proposed API in SetStringer seems very dangerous; you're passing back a const ref to a string that isn't really const at all (successive calls after the set has been updated will mutate it's state, users may be unaware of this).

An accessor returning a const reference is perfectly normal. Something being const in C++ never meant that other things can't modify it, and the risk you describe has nothing to do with it being a const reference, as opposed to the non-const kind. For example, one given by the article, in fact, you get the same situation with an even bigger risk profile with the expression v[i], where v is a std::vector<std::string>.

> ref when you want shared ownership.

Not at all. If you have a reference to something, you don't own it.

You misunderstand the point of reference-to-const in C++. It is just a const view of an object, it makes no promises about the object itself. For functions returning a reference-to-const, the value of the contract is for the function, not the caller. For functions accepting reference-to-const parameters, the value of the contract is for the caller, not the function.
Your horror is completely justified: this is not correct C++. Accessing an object after it has been move()d it squarely undefined behaviour
It is not undefined behaviour. The object is left in a valid, but unspecified state. If you .clear() it or whatever (in case of a vector), you'll have a perfectly good object to use.
You're right; s/undefined/unspecified/.

The comment on the article still stands though. You're still invoking unspecified behaviour, and this is a horrible practice - it forces you to take not on which classes you are sure of the actual behaviour when accessing after move, and you're bound to slip.

(Note that the article itself uncorrectly says that move leaves string in an invalid state, which is what threw me off track)

Actually, that's not true. Here's what I wrote:

"By stealing the string, we put foo in an invalid state".

foo is an object of the SetStringer class. It puts it an in invalid state because it no longer maintains its own invariants.

The whole point of this article is about how you can move member variables for efficiency, while ensuring the owning object is left in a valid state (as move/rvalue semantics require).

Wanting to be able to move a member variable is no stranger than wanting to move the object itself; in that sense an rvalue ref overloaded getter is no stranger than move construction/assignment, it's just more rare because it's less used.

That's my main point of disagreement: wanting to move a member variable is more strange than moving the object itself. Consider your use case: you have some cached string that you may or may not want to move out. If you move it, the next time you access "it", the string needs to be re-computed. This means the string is going to re-malloc the memory that it had before it was moved. I don't understand what "efficiency" the outlined code gives you.

While I am not a C++ day-developer nor guru, from what I have seen and written of modern C++, the "move member variables" is not "good", in any sense of the word, stemming from the reasons put forth by many a comment.

If you ever want to move a member variable, I would highly recommend that you re-think your design. Ask questions such as "why do I want this?" and "is it really more efficient?". In some cases, where perf is really that important, then yea, go for it. However, just because something can be done, does not mean that it should be done.

While I still disagree with your const & return of the original string, I threw together an example showing the same exact API you proposed, but without moving member variables:

https://gist.github.com/wallstop/4d80fba9b1d15da64488

Depending on method usage, this may or may not invoke the cost of an extra string creation (+ extra on move method, - for continual access of cached string)

His example is perfectly fine. The example that puts the object in an invalid state is an example of what not to do. The example of what you should do, which is to make the method marked "&&" so it can only be used on rvalue reference, i.e. an object that's going to be discarded anyway, makes a method that leaves the object in a valid state and one that conforms perfectly to expectations. It even politely cleans up m_set.

Generally speaking, there's nothing wrong at all with moving from a member variable. You can see an example of where it's perfectly all right in this article.

> you have some cached string that you may or may not want to move out. If you move it, the next time you access "it", the string needs to be re-computed. This means the string is going to re-malloc the memory that it had before it was moved. I don't understand what "efficiency" the outlined code gives you.

The method in question can only be called on an rvalue reference, which means the object, in its present state, is not going to be used again in the future anyway, so the problem of reallocating the string is moot.

> https://gist.github.com/wallstop/4d80fba9b1d15da64488

Your code has a use-after-free bug, because your std::string&& str() && function returns a local variable by reference. Your 'std::string&& str() &&' function doesn't even destructively modify the object, so there's no reason for it to be marked '&&' in the first place, and there's no reason for it to exist at all, because its performance is worse than the 'const std::string & str() &' version.

So, in retrospect, my example was not a good one, I'm afraid. the recomputing of the string was a red herring. All that matters is that the moved-from object was in a valid state. You can look at the updated post, maybe it will help a bit.

If you want a concrete example, consider std::stringstream. You agree this class is useful, right? In many cases, you build up a string, bit by bit. You then want to extract the string at the end. Because stringstream was written before move semantics, that string is returned by value. That means a full copy of the internal string buffer is made.

In 95% of real life use cases of stringstream, the stringstream is a local variable used to build up a string and then discarded. So copying that string is fundamentally a waste. But exposing the string buffer in a way that std::move could be fruitfully called on it is dangerous, it may break stringstream's class invariants and turn it into a toxic object, this is avoided in C++. So the preferred solution would be to give stringstream::str an rvalue overload, so that the stringstream can safely release its string buffer. Does that make sense?

Of course I am aware that this code involves extra work, is harder for less experienced team members to understand, etc. I wouldn't do it in the vast majority of situations. But sometimes it is useful. The goal was to try to help people understand &&/& member overloads, as there isn't much discussion of them.

Respectfully, I am a professional c++ developer. Odds are that the code I write is under greater pressure for performance and genericity. Maybe a bit of benefit of the doubt is in order.

I think concentrating on the move part is the wrong way to look at rvalue-ref methods. The advantage is that if the compiler knows an object is a temporary it you can safely let the caller safely "steal" a member object.

  #include <string>
  #include <stdio.h>

  struct A {
    const std::string& str() const & { return m_str; }
    std::string&& str() && { return std::move(m_str); }

    std::string m_str;
  };

  void func(const std::string& s) { puts("Got const value"); }
  void func(std::string&& s) { puts("Got rvalue"); }

  A returns_a() { A a; return a; }

  int main()
  {
    func(returns_a().str());
    A b;
    func(b.str());
    return 0;
  }
The standard defines preconditions when talking about what operations on valid on its containers. std::string::clear() has no preconditions so it is valid to call on a moved from object and then you can be sure of its state.