
by dataflow 530 days ago

Thanks for the 1-hour video. Could you link to the timestamp of the strongest argument(s) you see in the video that are relevant in the current discussion (i.e. the existing error models we're talking about in Rust and C++, rather than a hypothetical future one)?

Just from a quick glance: I see he's talking about things like stack overflows and std::bad_alloc. In a discussion like this, those two are probably the worst examples of exceptions. They're the most severe exceptions, the ones the fewest people care to actually catch, and the ones that error codes are possibly the worst at handling anyway. (Do you really want an error returned from push_back?) The most common stuff is I/O errors, permission errors, format errors, etc., which aren't well represented by resource exhaustion at all, much less memory exhaustion.

P.S. W.r.t. "the top C++ gurus/leaders" - Herb is certainly talented, but I should note that the folks who wrote Google's style guide are... not amateurs. They have been involved in the language development and standardization process too. And they're just as well aware of the benefits and footguns as anyone.

3 comments

dwattttt 530 days ago

The general problem cited with exceptions is that they're un-obvious control flow. The impact it has is clearer in Rust, because of the higher bar it sets for safety/correctness.

As a specific example (and this is something that's been a problem in the std lib before): when you write something that needs to maintain an invariant, e.g. a length field for an unsafe operation, that invariant has to be upheld on every path out of your function.

In the absence of exceptions, you just need to make sure your length is correct on returns from your function.

With exceptions, the exits from your function now include any function call that could raise an exception; this is way harder to deal with in the general case. You can add one exception handler to your function, but it needs to deal with fixing up your invariant wherever the exception occurred (e.g. if the fix-up operation that needs to happen is different based on where in your function the exception occurred).

To avoid that you can wrap every call that can cause an exception so you can do the specific cleanup that needs to happen at that point in the function... But at that point what's the benefit of exceptions?
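To make the hazard concrete, here is a minimal C++ sketch. The `Buffer` type, its `len` field, and `visit_all_but_last` are hypothetical names made up for illustration, not taken from any real library. The function temporarily shrinks the length, calls user code, and restores the length only on the path where nothing throws.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>

// Hypothetical container fragment: `len` must equal the number of live
// elements whenever control leaves a member function.
struct Buffer {
  int data[4] = {1, 2, 3, 4};
  std::size_t len = 4;

  // Temporarily hides the last element while `visit` runs, then restores it.
  // If `visit` throws, the restore line is skipped and `len` stays too small.
  void visit_all_but_last(const std::function<void(int)>& visit) {
    --len;
    for (std::size_t i = 0; i < len; ++i) visit(data[i]);
    ++len;  // Only reached on the non-throwing path.
  }
};

// Returns the length observed after a throwing callback escapes.
std::size_t len_after_throw() {
  Buffer b;
  try {
    b.visit_all_but_last(
        [](int) { throw std::runtime_error("callback failed"); });
  } catch (const std::runtime_error&) {
    // Swallowed on purpose: we only care about the state left behind.
  }
  return b.len;
}
```

In this sketch the only visible exit is the end of the function, yet every call to `visit` is a hidden one, which is the situation described above.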


dataflow 530 days ago

> With exceptions, exits from your function are now any function call that could raise an exception; this is way harder to deal with in the general case. You can add one exception handler to your function [...] To avoid that you can wrap every call [...]

That's the wrong way to handle this though. The correct way (in most cases) is with RAII. See scope guards (std::experimental::scope_exit, absl::Cleanup, etc.) if you need helpers. Those are not "way harder" to deal with, and whether the control flow out of the function is obvious or not is completely irrelevant to them -- in fact, that's kind of their point.

In fact, they're better than both exception handling and error codes in at least one respect: they actually put the cleanup code next to the setup code, making it harder for them to go out of sync.
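For readers who haven't used a scope guard, here is a rough C++17 sketch of the idea. `ScopeExit` is a hypothetical hand-rolled stand-in, not the actual implementation of absl::Cleanup or std::experimental::scope_exit, and `with_pushed` is an invented example function. The point it shows is that the cleanup runs whether the function returns normally or an exception unwinds through it.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical minimal scope guard; a stand-in for absl::Cleanup or
// std::experimental::scope_exit, not their actual implementation.
template <typename F>
class ScopeExit {
 public:
  explicit ScopeExit(F f) : f_(std::move(f)) {}
  ScopeExit(const ScopeExit&) = delete;
  ScopeExit& operator=(const ScopeExit&) = delete;
  ~ScopeExit() { f_(); }  // Runs on normal return and during unwinding.

 private:
  F f_;
};

// Pushes k for the duration of the call; the guard pops it on every exit.
int with_pushed(std::vector<int>& stack, int k, bool should_throw) {
  stack.push_back(k);
  ScopeExit pop([&] { stack.pop_back(); });
  if (should_throw) throw std::runtime_error("failure mid-function");
  return static_cast<int>(stack.size());  // Evaluated before the guard runs.
}

// Returns true if the stack has its original size after both kinds of exit.
bool stack_restored_on_all_exits() {
  std::vector<int> stack{1, 2};
  with_pushed(stack, 3, /*should_throw=*/false);
  bool ok = stack.size() == 2;
  try {
    with_pushed(stack, 3, /*should_throw=*/true);
  } catch (const std::runtime_error&) {
    // Expected; the guard has already popped the element.
  }
  return ok && stack.size() == 2;
}
```

A production guard would also need to think about cleanup code that itself throws; this sketch assumes it does not.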


dwattttt 530 days ago

None of those are easier than not needing to do it at all though; if your function's exits are only where you specify, you can clean up only once on those paths.


dataflow 530 days ago

> None of those are easier than not needing to do it at all though; if your function's exits are only where you specify, you can clean up only once on those paths.

Huh? I don't get it. This:

  stack.push_back(k);
  absl::Cleanup _ = [&] { assert(stack.back() == k); stack.pop_back(); };
  if (foo()) {
    printf("foo()\n");
    return 1;
  }
  if (bar()) {
    printf("bar()\n");
    return 2;
  }
  baz();
  return 3;

is easier, more readable, and more robust than:

  stack.push_back(k);
  if (foo()) {
    printf("foo()\n");
    assert(stack.back() == k);
    stack.pop_back();
    return 1;
  }
  if (bar()) {
    printf("bar()\n");
    assert(stack.back() == k);
    stack.pop_back();
    return 2;
  }
  baz();
  assert(stack.back() == k);
  stack.pop_back();
  return 3;

as well as:

  stack.push_back(k);
  auto pop_stack = [&] { assert(stack.back() == k); stack.pop_back(); };
  if (foo()) {
    printf("foo()\n");
    pop_stack();
    return 1;
  }
  if (bar()) {
    printf("bar()\n");
    pop_stack();
    return 2;
  }
  baz();
  pop_stack();
  return 3;

and unlike the others, it avoids repeating the same code three times.

(Ironically, I missed the manual cleanups before the final returns in the last two examples right as I posted this comment. Edited to fix now, but that itself should say something about which approach is actually more bug-prone...)


dwattttt 530 days ago

I can't parse this super well on mobile, but what invariant is this maintaining? I was imagining a function that manipulated a collection, and e.g. needed to decrement a length field to ensure the observable elements remained valid, then increment it, then do something else.

The gnarliest scenario I recall was a ring-buffer implementation that relied on a field always being within the valid length, and a single code path not performing a mod operation, which was only observably a problem after a specific sequence of reserving, popping, and pushing.

EDIT: oh, I think I see; is your code validating the invariant, or maintaining it?


dataflow 530 days ago

> I can't parse this super well on mobile, but what invariant is this maintaining?

The stack length (and contents, too). It pushes, but ensures a pop occurs upon returning. So the stack looks the same before and after.

> I was imagining a function that manipulated a collection, and e.g. needed to decrement a length field to ensure the observable elements remained valid, then increment it, then do something else.

That is exactly what the code is doing.

> EDIT: oh, I think I see; is your code validating the invariant, or maintaining it?

Both. First it manipulates the stack (pushing onto it), then it does some stuff. Then before returning, it validates that the last element is still the one pushed, then pops that element, returning the stack to its original length & state.

> The gnarliest scenario I recall was a ring-buffer implementation that [...]

That sounds like the kind of thing scope guards would be good at.


SubjectToChange 530 days ago

>Just from a quick glance: I see he's talking about things like stack overflows and std::bad_alloc.

There are specific scenarios where that's a major issue, yes. But as the title of the video implies, the problem with exceptions runs far deeper. Imagine being a C++ library author who wants to support as many users as possible, you simply couldn't use exceptions even if you wanted to, and even if most of your users are using exceptions. The end result is that projects that use exceptions have to deal with two different methods of error handling, i.e. they get the worst of both worlds (the binary footprint of exceptions, the overhead of constantly checking error codes, and the mental overhead of dealing with it all).

C++ exceptions are a genuinely useful language feature. But I wish the language and standard library weren't designed around exceptions. C++ has managed to displace C almost everywhere except embedded and/or kernel programming, and exceptions are a big reason it hasn't displaced C there.


dataflow 530 days ago

> Imagine being a C++ library author who wants to support as many users as possible, you simply couldn't use exceptions even if you wanted to

I'm pretty sure that (much) less than 50% of the C++ code out there is "a C++ library that wants to support as many users as possible" -- I imagine most code is application code, not even C++ library code in the first place. It's perfectly fine to throw e.g. a "network connection was closed" or "failed to write to disk" exception and then catch it somewhere up the stack.

> The end result is that projects that use exceptions have to deal with two different methods of error handling. i.e. they get the worst of both worlds

No, that's not true. You might get a bit of marginal overhead to think about, but it's not the worst of both whatsoever. If you want to use exceptions and your library doesn't use them, all you gotta do is wrap the foo() call in CheckForErrors(foo()), and then handle it (if you want to handle it at all) at the top level of your call chain. It's not the worst of both worlds at all -- in fact it's literally less work than simply writing

  std::expected<Result, std::error_code> e = foo();

and on top of that you get to avoid the constant checking of error codes and modifying every intermediate caller, leaving their code much simpler and more readable.

And if you don't want to use exceptions but your library does use them, you can of course do the reverse (passing a callable, so that the call happens inside the wrapper's try block):

  std::expected<Result, std::error_code> e = CallAndCatchError([&] { return foo(); });
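As a rough illustration of how small the two wrappers can be, here is a C++17 sketch. Everything in it is an assumption made for the example: `Expected` is a std::variant stand-in for std::expected (which needs C++23), the bodies of `CheckForErrors` and `CallAndCatchError` are one possible implementation of the helpers named above, and `read_config` / `read_config_or_throw` are invented library functions.

```cpp
#include <cassert>
#include <string>
#include <system_error>
#include <utility>
#include <variant>

// Stand-in for std::expected<T, std::error_code> (C++23) so this builds as
// C++17. Assumes T is not itself std::error_code.
template <typename T>
using Expected = std::variant<T, std::error_code>;

// Error code -> exception: unwraps the value or throws std::system_error.
template <typename T>
T CheckForErrors(Expected<T> e) {
  if (auto* ec = std::get_if<std::error_code>(&e)) throw std::system_error(*ec);
  return std::get<T>(std::move(e));
}

// Exception -> error code: takes a callable so the call runs inside the try.
// Only std::system_error is translated; anything else still propagates.
template <typename F>
auto CallAndCatchError(F&& f) -> Expected<decltype(f())> {
  try {
    return f();
  } catch (const std::system_error& e) {
    return e.code();
  }
}

// Hypothetical error-code-style library call.
Expected<std::string> read_config(bool ok) {
  if (!ok) return std::make_error_code(std::errc::permission_denied);
  return std::string("contents");
}

// Hypothetical exception-style wrapper around the call above.
std::string read_config_or_throw(bool ok) {
  return CheckForErrors(read_config(ok));
}
```

Note that the catching direction has to receive a callable (or be a macro): written as a plain function taking the result of `foo()`, the exception would escape before the wrapper ever ran.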

Nobody is claiming every error should be an exception. I'm just saying you're exaggerating and extrapolating the arguments too far. A sane project would have a mix of different error models, and that would very much still be the case if none of the problems you mentioned existed at all, because they're different tools solving different problems.


tialaramex 530 days ago

> Do you really want an error returned from push_back?

For most people, no, you definitely want it to just work or explode, which is indeed what happens in normal Rust, and, not coincidentally, the actual effect when this exception happens in your typical C++ application after it is done with all the unwinding and discovers there is no handler (or that the handler was never tested and doesn't actually somehow cope).

But, sometimes that is what you wanted, and Linus has been very clear it's what he wants in the kernel he created.

For such purposes Rust has Vec::try_reserve() and Vec::push_within_capacity() which let us express the idea that we'd like more room and to know if that wasn't possible, and also if there was no room left for the thing we pushed we want back the thing we were trying to push - which otherwise we don't have any more.

There is no analogous C++ API, std::vector just throws an exception and good luck to you AFAIK.


dataflow 530 days ago

> For such purposes Rust has Vec::try_reserve() and Vec::push_within_capacity() [...] There is no analogous C++ API, std::vector just throws an exception and good luck to you AFAIK.

https://godbolt.org/z/6xE6jr3zr ?


tialaramex 530 days ago

I guess this is an attempt at Vec::push_within_capacity ? Your function takes a reference and then tries to copy the referenced object into the growable array. But of course nobody said this object can be copied - after all we want it back if it won't fit so perhaps it's unique or expensive to make.


dataflow 530 days ago

> I guess this is an attempt at Vec::push_within_capacity?

Sure, yes. It's trivial to change to try_reserve if that's what you want. (There are other solutions for that as well, but they're more complicated and better for other situations.)

> Your function takes a reference and then tries to copy the referenced object into the growable array. But of course nobody said this object can be copied - after all we want it back if it won't fit so perhaps it's unique or expensive to make

Just extend it to allow moves then? It's pretty trivial. (Are you familiar with move semantics in C++?)


tialaramex 530 days ago

But how? I did attempt this before I replied, but of course before long I had inexplicable segfaults, and we're not in a thread about those problems with C++.

I can't see how to make that work, but I also can't say for sure it's impossible. All I can tell you is that I was genuinely trying, and all I got for my trouble was a segfault that I don't understand and couldn't fix.

Edited to add: In case it helps the signature we want is:

    pub fn push_within_capacity(&mut self, value: T) -> Result<(), T>

If you're not really a Rust person: this takes a value T. Not a reference, not a magic ultra-hyper-reference, not a pointer; it takes the value T, and the value is gone now, which just isn't a thing in C++. It then returns either Ok(()), which signifies that this worked, or Err(T), giving back the T because we couldn't push it.


dataflow 530 days ago

I'm sorry, I don't think I understand the problem you're trying to illustrate. I'm not sure why you're emphasizing value vs. reference, but even if that's what you want, this works just fine: https://godbolt.org/z/P8EGPYWW5
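Without claiming this matches the linked code, here is one way such helpers could look in C++17. Both functions are hypothetical free functions written for this illustration; std::vector has no members with these names. `push_within_capacity` takes the element by value (the caller moves it in) and hands it back inside a std::optional when there is no spare capacity, so it also works for move-only types such as std::unique_ptr.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <optional>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical helper, not a standard API. Returns std::nullopt on success,
// or gives the value back if the vector has no spare capacity.
template <typename T>
[[nodiscard]] std::optional<T> push_within_capacity(std::vector<T>& v, T value) {
  if (v.size() == v.capacity()) return std::move(value);  // No room.
  v.push_back(std::move(value));  // size < capacity, so no reallocation.
  return std::nullopt;
}

// Hypothetical analog of Vec::try_reserve: reports failure instead of
// letting the allocation exception escape.
template <typename T>
[[nodiscard]] bool try_reserve(std::vector<T>& v, std::size_t additional) {
  if (additional > v.max_size() - v.size()) return false;  // Would overflow.
  try {
    v.reserve(v.size() + additional);
    return true;
  } catch (const std::bad_alloc&) {
    return false;
  } catch (const std::length_error&) {
    return false;
  }
}
```

The rough Rust-to-C++ mapping assumed here is Ok(()) to std::nullopt and Err(value) to an engaged optional. Unlike in Rust, the caller's moved-from object still exists afterwards, in a valid but unspecified state.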
