by CJefferson 410 days ago
I had a look. In classic C++ style, if you use *x to get the ‘expected’ value, when it’s an error object (you forgot to check first and return the error), it’s undefined behaviour!

Messing up error handling isn’t hard to do, so putting undefined behaviour here feels very dangerous to me, but it is the C++ way.


The reason it works this way is there's legitimately no easy way around it. You're not guaranteed a reasonable zero value for any type, so you can't do the slightly better Go thing (defined behavior but still wrong... Not great.) and you certainly can't do the Rust thing, because... There's no pattern matching. You can't conditionally enter a branch based on the presence of a value.

There really is no reasonable workaround here, the language needs to be amended to make this safe and ergonomic. They tried to be cheeky with some of the other APIs, like std::variant, but really the best you can do is chuck the conditional branch into a lambda (or other function-based implementation of visitors) and the ergonomics of that are pretty unimpressive.
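For anyone who hasn't seen the "chuck the branch into a lambda" approach, here is a minimal C++17 sketch of it. `ParseError`, `ParseResult`, `parse_digit`, and `describe` are hypothetical names made up for illustration; only `std::variant` and `std::visit` come from the standard library.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Hypothetical error type for this sketch.
struct ParseError {
    std::string message;
};

// A value-or-error result modelled as a two-alternative variant.
using ParseResult = std::variant<int, ParseError>;

// Helper that merges several lambdas into one visitor (a common C++17 idiom).
template <class... Fs>
struct overloaded : Fs... {
    using Fs::operator()...;
};
template <class... Fs>
overloaded(Fs...) -> overloaded<Fs...>;

// Hypothetical parser: accepts a single decimal digit only.
ParseResult parse_digit(const std::string& text) {
    if (text.size() == 1 && text[0] >= '0' && text[0] <= '9') {
        return text[0] - '0';
    }
    return ParseError{"not a single digit: " + text};
}

// Each branch lives in its own lambda. The int is only reachable inside
// the lambda for the alternative that is actually held.
std::string describe(const ParseResult& result) {
    return std::visit(
        overloaded{
            [](int value) { return "ok: " + std::to_string(value); },
            [](const ParseError& error) { return "error: " + error.message; }
        },
        result);
}
```

This is safe in the sense that there is no way to touch the wrong alternative, but you can see the ergonomics problem: you can't `return` or `break` out of the enclosing function from inside a lambda.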

Edit: but maybe fortune will change in the future, for anyone who still cares:

https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p26...

You could assert. You could throw. I can’t understand how, in this modern age where so many programs end up getting hacked, introducing more UB seems like a good idea.

This is one of the major reasons I switched to Rust, just to escape spending my whole life worrying about bugs caused by UB.

Assertions are debug-only. Exceptions are usually not guaranteed to be available and much of the standard library doesn't require them. You could std::abort, and that's about it.
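To make the assert-versus-abort difference concrete, here is a rough sketch. `deref_or_abort` is a hypothetical helper, not something from the standard library, and it uses `std::optional` only so that it compiles as C++17.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <optional>

// Hypothetical helper: dereference an optional, aborting if it is empty.
template <class T>
const T& deref_or_abort(const std::optional<T>& maybe) {
    // assert() expands to nothing when NDEBUG is defined, so on its own it
    // gives no protection in a typical release build.
    assert(maybe.has_value());

    // This check survives release builds and does not need exceptions.
    if (!maybe.has_value()) {
        std::fputs("deref_or_abort: empty optional\n", stderr);
        std::abort();
    }
    return *maybe;  // safe: has_value() was checked above
}
```

The cost is one branch per access, and the failure is still a runtime one that you can neither catch nor rule out statically.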

I think the issue is that this just isn't particularly good either. If you do that, then you can't catch it like an exception, but you also can't statically verify that it won't happen.

C++ needs less of both undefined behavior and runtime errors. It needs more compile-time errors. It needs pattern matching.

I agree these things would be better, but I don’t understand how anyone can think UB is better than abort.

(Going to moan for a bit, and I realise you aren’t responsible for the C++ standards mess!)

I have been hearing for about… 20 years now that UB gives compilers and tools the freedom to produce any error catching they like, but all it seems to have done in the main is give them the freedom to produce hard to debug crash code.

You can of course usually turn on some kind of “debug mode” in some compilers, but why not just enforce that as standard? Compilers would still be free to add a “standards non-compliant” go fast mode if they like.

> but why not just enforce that as standard

I don’t think people want that as standard. The whole point of using C++ tends to be that you can do whatever you need to for the sake of performance. The language is also heavily driven by firms that need extreme performance (because otherwise why not use a higher-level language?).

There are knobs like stdlib assertions and ubsan, but those are opt-in because there’s a cost to them. Part of it is also the commitment to backwards compatibility: code that compiled before should generally still compile now (though there are exceptions to that unofficial rule).

There does not need to be an additional cost for this.

Most users will do this:

1. Check if there is a value

2. Get the value

There is nothing theoretically preventing the compiler from enforcing that step 1 happens before step 2, especially if the compiler is able to combine the control flow branch with the process of conditionally getting the value. The practical issue is that there's no way to express this in C++ at all. The best you can do is the visitor pattern, which has horrible ergonomics and you can only hope it doesn't cause worse code generation too.

Some users want to do this:

1. Grab the value without checking to see if it's valid. They are sure it will be valid and can't or don't want to eat the cost of checking.

There is nothing theoretically preventing this from existing as a separate method.
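As a rough illustration of "separate method", here is a hypothetical `Outcome<T, E>` type (not `std::expected`, and not any real library's API) where the default accessor is checked and the unchecked one has to be asked for by name:

```cpp
#include <cassert>
#include <cstdlib>
#include <optional>
#include <string>
#include <utility>

// Hypothetical value-or-error type, invented for this sketch.
template <class T, class E>
class Outcome {
public:
    static Outcome ok(T value) {
        Outcome out;
        out.value_ = std::move(value);
        return out;
    }
    static Outcome err(E error) {
        Outcome out;
        out.error_ = std::move(error);
        return out;
    }

    bool has_value() const { return value_.has_value(); }

    // Checked access: misuse stops the program instead of being UB.
    const T& value() const {
        if (!value_) std::abort();
        return *value_;
    }

    // Unchecked access: the caller promises has_value() is true.
    // Calling this on an error is undefined behaviour, by design.
    const T& unchecked_value() const { return *value_; }

    // Checked access to the error side.
    const E& error() const {
        if (!error_) std::abort();
        return *error_;
    }

private:
    Outcome() = default;
    std::optional<T> value_;
    std::optional<E> error_;
};
```

Nothing here enforces at compile time that the check happened first, which is exactly the part the language can't express today; it only moves the footgun behind a longer name.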

I'm not a rust fanboy (seriously, check my GitHub @jchv and look at how much Rust I write, it's approximately zero) but Rust has this solved six ways through Sunday. It can do both of these cases just fine. The only caveat is that you have to wrap the latter case in an unsafe, but either way, you're not eating any costs you don't want to.

C++ can do this too. C++ has an active proposal for a feature that can fix this problem and make much more ergonomic std::variant possible, too.

https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p26...

Of course, this is one single microcosm in the storied history of C++ failing to adequately address the problem of undefined behavior proliferating throughout the language, so I don't have high hopes.

A lot of UB is things you wouldn't do anyway. While it is possible to define divide by zero or integer overflow, what would it mean? If your code does either of those things, you have a bug in your code (a few encryption algorithms depend on specific overflow behavior; if your language promises that same behavior, it is useful).

Since CPUs handle such things differently, whatever you define to happen means that the compiler has to insert an if to check on any CPU that doesn't work how you define it, all for something that you are probably not doing. The cost is too high in a tight loop when you know this won't even happen (but the compiler does not).
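For reference, this is roughly what the "insert an if" version of signed addition looks like when written by hand. `checked_add` is a hypothetical helper for illustration; the point is just that the checks are extra branches before the add.

```cpp
#include <cassert>
#include <limits>
#include <optional>

// Returns a + b, or std::nullopt if the mathematical sum does not fit in int.
// The comparisons are arranged so that they themselves can never overflow.
std::optional<int> checked_add(int a, int b) {
    constexpr int kMax = std::numeric_limits<int>::max();
    constexpr int kMin = std::numeric_limits<int>::min();
    if (b > 0 && a > kMax - b) return std::nullopt;  // would exceed kMax
    if (b < 0 && a < kMin - b) return std::nullopt;  // would go below kMin
    return a + b;  // in range, so this addition is well defined
}
```

Whether those branches matter in a given tight loop is something you would have to measure; this sketch makes no claim either way.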

No

This is a bad answer too, IMO.

I think there is a solid case for the existence of undefined behavior; even Rust has it, it's nothing absurd in concept, and you do describe some reasoning for why it should probably exist.

However, and here's the real kicker, it really does not need to exist for this case. The real reason it exists for this case is due to increasingly glaring deficiencies in the C++ language, namely, again, the lack of any form of pattern matching for control flow. Because of this, there's no way for a library author, including the STL itself, to actually handle this situation succinctly.

Undefined behavior indeed should exist, but not for common cases like "oops, I didn't check to see if there was actually a value here before accessing it." Armed with a moderately sufficient programming language, the compiler can handle that. Undefined behavior should be more like "I know you (the compiler) can't know this is safe, but I already know that this unsafe thing I'm doing is actually correct, so don't generate safeguards for me; let what happens, happen." This is what modern programming languages aim to do. C++ does that for shit like basic arithmetic, and that's why we get to have the same fucking CVEs for 20+ years, over and over in an endless loop. "Just get better at programming" is a nice platitude, but it doesn't work. Even if it was possible for me to become absolutely perfect and simply just never make any mistakes ever (lol) it doesn't matter because there's no chance in hell you'll ever manage that across a meaningful segment of the industry, including the parts of the industry you depend on (like your OS, or cryptography libraries, and so on...)

And I don't think the issue is that the STL "doesn't care" about the possibility that you might accidentally do something that makes no sense. Seriously, take a look at the design of std::variant: it is pretty obvious that they wanted to design a "safe" union. In fact, what the hell would the point of designing another unsafe union be in the first place? So they go the other route. std::variant has getters that throw exceptions on bad accesses instead of undefined behavior. This is literally the exact same type of problem that std::expected has. std::expected is essentially just a special case of a type-safe union with exactly two possible values, an expected and unexpected value (though since std::variant is tagged off of types, there is the obvious caveat that std::expected isn't quite a subset of std::variant, since std::expected could have the same type for both the expected and unexpected values.)
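A small sketch of the two `std::variant` access styles mentioned above. `Payload`, `get_int_throws`, and `int_or` are hypothetical names; `std::get`, `std::get_if`, and `std::bad_variant_access` are the standard ones.

```cpp
#include <cassert>
#include <string>
#include <variant>

using Payload = std::variant<int, std::string>;

// std::get throws std::bad_variant_access when the requested alternative
// is not the one currently held, rather than invoking UB.
bool get_int_throws(const Payload& payload) {
    try {
        (void)std::get<int>(payload);
        return false;
    } catch (const std::bad_variant_access&) {
        return true;
    }
}

// std::get_if returns a null pointer on mismatch, which is about as close
// as C++17 gets to "enter the branch only if the value is there".
int int_or(const Payload& payload, int fallback) {
    if (const int* value = std::get_if<int>(&payload)) {
        return *value;
    }
    return fallback;
}
```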

So, what's wrong? Here's what's wrong. C++ Modules were first proposed in 2004[1]. C++20 finally introduced a version of modules and lo and behold, they mostly suck[2] and mostly aren't used by anyone (Seriously: they're not even fully supported by CMake right now.) Andrei Alexandrescu has been talking about std::expected since at least 2018[3] and it just now finally managed to get into the standard in C++23, and god knows if anyone will ever actually use it. And finally, pattern matching was originally proposed by none other than Bjarne himself (and Gabriel Dos Reis) in 2019[4] and who knows when it will make it into the standard. (I hope soon enough so it can be adopted before the heat death of the Universe, but I think that's only if we get exceptionally lucky.)

Now I'm not saying that adding new and bold features to a language as old and complex as C++ could possibly ever be easy or quick, but the pace that C++ evolves at is sometimes so slow that it's hard to come to any conclusion other than that the C++ standard and the process behind it are simply broken. It's just that simple. I don't care what changes it would take to get things moving more efficiently: it's not my job to figure that out. It doesn't matter why, either. The point is, at the end of the day, it can't take this long for features to land just for them to wind up not even being very good, and there are plenty of other programming languages that have done better with fewer resources.

I think it's obvious at this point that C++ will never get a handle on all of the undefined behavior; they've just introduced far too much undefined behavior all throughout the language and standard library in ways that are going to be hard to fix, especially while maintaining backwards compatibility. It should go without saying that a meaningful "safe" subset of C++ that can guarantee safety from memory errors, concurrency errors or most types of undefined behavior is simply never going to happen. Ever. It's not that it isn't possible to do, or that it's not worth doing, it's that C++ won't. (And yes, I'm aware of the attempts at this; they didn't work.)

The uncontrolled proliferation of undefined behavior is ultimately what is killing C++, and a lot of very trivial cases could be avoided, if only the language was capable of it, but it's not.

[1]: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n17...

[2]: https://vector-of-bool.github.io/2019/01/27/modules-doa.html

[3]: https://www.youtube.com/watch?v=PH4WBuE1BHI

[4]: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p13...

Culturally, I think C++ has a policy of "there's no single right answer." Which leads to there being no wrong answers. We just need more answers so everyone's happy. Which is worse.

Abort would be fine here. operator* on expected is intended to be used when you have already verified the result wasn’t an error.

Of course you can do the Rust thing, it's just taking a function object.

`StatusOr<T>::operator*` there is akin to `Result<T, _>::unwrap()`. In C++, unwrapping looks like dereferencing a pointer, which is scary and likely UB already.

But as you learn to work with StatusOr you'll end up just using ASSIGN_OR_RETURN every time, and dereferencing remains scary. I guess the complaint is that C++ won't guarantee that execution will stop, but that's the C++ way after you drop all safety checks in `StatusOr::operator*` to gain performance.

This is the idiomatic way in C++. I'm not even sure what your proposed alternative is -- as other commenters have noted, an exception or "panic" are not actual options.

Every pointer dereference, array access, and even integer truncation is UB in C++. This isn't rust.

A static analyzer can and does catch these errors and others internally. Typical usage of StatusOr is via macros like ASSIGN_OR_RETURN and RETURN_IF_ERROR; actually using the * operator would definitely draw my attention in code review.
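For readers who haven't used that style, here is a heavily simplified sketch of what such a macro does. Everything in it is hypothetical: `Status`, `StatusOr`, and `SKETCH_ASSIGN_OR_RETURN` are stand-ins written for this comment and are not the real Abseil types or any real library's macro.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical status type for this sketch.
struct Status {
    bool ok;
    std::string message;
};

// Hypothetical, minimal StatusOr-like wrapper.
template <class T>
class StatusOr {
public:
    StatusOr(T value) : value_(std::move(value)), status_{true, ""} {}
    StatusOr(Status status) : status_(std::move(status)) {}

    bool ok() const { return status_.ok; }
    const Status& status() const { return status_; }

    // Unchecked, like the operator discussed above: UB if !ok().
    T& operator*() { return *value_; }

private:
    std::optional<T> value_;
    Status status_;
};

#define SKETCH_CONCAT_INNER(a, b) a##b
#define SKETCH_CONCAT(a, b) SKETCH_CONCAT_INNER(a, b)

// Evaluates expr once; on error returns the status from the enclosing
// function, otherwise moves the value into lhs. It expands to several
// statements, so it must not be the body of an unbraced if or loop.
#define SKETCH_ASSIGN_OR_RETURN(lhs, expr)                      \
    auto SKETCH_CONCAT(sketch_tmp_, __LINE__) = (expr);         \
    if (!SKETCH_CONCAT(sketch_tmp_, __LINE__).ok())             \
        return SKETCH_CONCAT(sketch_tmp_, __LINE__).status();   \
    lhs = std::move(*SKETCH_CONCAT(sketch_tmp_, __LINE__))

// Hypothetical fallible function.
StatusOr<int> parse_positive(int raw) {
    if (raw <= 0) return Status{false, "not positive"};
    return raw;
}

// The caller never writes the dereference; the macro does it after the check.
StatusOr<int> doubled(int raw) {
    SKETCH_ASSIGN_OR_RETURN(int value, parse_positive(raw));
    return value * 2;
}
```

The dereference only happens inside the macro after the `ok()` check, which is why a hand-written `*` stands out in review.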

Very similar footgun on std::optional::operator*. The big C++ libraries do at least have (debug-only) assertions on misuse.
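A quick sketch of the checked alternatives that `std::optional` does offer next to `operator*`. `find_setting`, `setting_or`, and `value_throws_when_empty` are hypothetical names for illustration; `value()`, `value_or()`, and `std::bad_optional_access` are standard.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical lookup: only the key "port" is known.
std::optional<int> find_setting(const std::string& key) {
    if (key == "port") return 8080;
    return std::nullopt;
}

// value_or() never touches an empty optional.
int setting_or(const std::string& key, int fallback) {
    return find_setting(key).value_or(fallback);
}

// value() is the checked accessor: it throws std::bad_optional_access
// when the optional is empty. operator* on the same object would be UB.
bool value_throws_when_empty(const std::optional<int>& maybe) {
    try {
        (void)maybe.value();
        return false;
    } catch (const std::bad_optional_access&) {
        return true;
    }
}
```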