Hacker News new | ask | show | jobs
by virtualritz 561 days ago
> "Unspecified" means that the implementation gets to decide what is the status of the string after move. For long enough strings, typical implementation strategy is to set the moved-from string in an empty state.

Thusly, what happens in code that accesses the string after the move is UB.

In the implementation of C++ the article uses the string was just empty. But for all we know it may still contain a 1:1 copy of the original or 20 copies or a gobbledygook of bytes.

Any code that relies on the string being something (even empty) may behave different if it isn't. That's the very definition of UB.

"A typical implementation strategy" is meaningless for someone writing code against a language specification.

You're then writing code against a specific compiler/std lib and that's fine. But let's be honest about it.

2 comments

That's not what UB means. "This will behave differently on different implementations" is implementation defined behavior. Compilers are not allowed to assume that implementation defined behavior never occurs or reject your program if they can prove that it happens.

Undefined behavior is a stronger statement and says that if the behavior occurs then the entire program is simply not valid. This allows the compiler to make vastly more aggressive changes to your program.

There is nothing in the standard or definition of C++ that states that undefined behavior renders a program invalid.

On the contrary the actual C++ standard explicitly states that permissible undefined behavior includes, and I quote "behaving during translation or program execution in a documented manner characteristic of the environment".

It's also worth noting that numerous well known and used C++ libraries explicitly make use of undefined behavior, including boost, Folly, Qt. Furthermore, as weird and ironic as this sounds, implementing cryptographic libraries is not possible without undefined behavior.

"valid program" is not really a term that is used in the standard (I only count one normative usage). What the standard does say is:

"A conforming implementation executing a well-formed program shall produce the same observable behavior as one of the possible executions of the corresponding instance of the abstract machine with the same program and the same input. However, if any such execution contains an undefined operation, this document places no requirement on the implementation executing that program with that input (not even with regard to operations preceding the first undefined operation)."

I.e. a program the contains UB is undefined.

Of course, as you observer, an implementation can go beyond the standard and extend the abstract machine to give defined semantics to those undefined operations.

That's still different from implementation defined behaviour, where a conforming implementation must give defined semantics.

> Thusly, what happens in code that accesses the string after the move is UB.

No, it is implementation-defined behaviour.

> In the implementation of C++ the article uses the string was just empty. But for all we know it may still contain a 1:1 copy of the original or 20 copies or a gobbledygook of bytes.

Yes, and if you want to make sure that the string is empty before you do something else with it, you just use a clear() (which will be optimised away by the compiler anyway).

Or, if you prefer, you can assign another string to it, or anything else really.

> Any code that relies on the string being something (even empty) may behave different if it isn't. That's the very definition of UB.

No it is not.

> "A typical implementation strategy" is meaningless for someone writing code against a language specification.

Then don't rely on that specific implementation detail and make sure that the string is in the state you want or, even better, don't touch the moved-from string ever again.