by steveklabnik 3741 days ago

  > No custom allocator can make the task gracefully report failure
  > instead of panicing.
So, first of all, "custom allocators" means two things:

  * overloading the allocator that's used by liballoc, and
    the crates that depend on it, like libstd
  * other allocators entirely
The first is described here: https://doc.rust-lang.org/book/custom-allocators.html

And the second is still in RFCs: https://github.com/rust-lang/rfcs/pull/1398

Both of these things are not yet stable. The second does, in fact, give you the ability to return an error code, by returning a Result.
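To illustrate what "returning a Result" from an allocation could look like, here is a minimal sketch. It is not the API proposed in the RFC: it assumes the `std::alloc` functions that were stabilized after this thread was written, and `RawBuf` and `AllocFailure` are hypothetical names made up for the example.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::ptr::NonNull;

/// Hypothetical error type for this sketch (not the RFC's type).
#[derive(Debug, PartialEq)]
pub enum AllocFailure {
    /// The requested size does not form a usable layout.
    BadLayout,
    /// The underlying allocator returned a null pointer.
    OutOfMemory,
}

/// Hypothetical owned byte buffer whose construction can fail.
pub struct RawBuf {
    ptr: NonNull<u8>,
    layout: Layout,
}

impl RawBuf {
    /// Allocates `size` bytes, reporting failure as a value instead of aborting.
    pub fn try_new(size: usize) -> Result<RawBuf, AllocFailure> {
        // `alloc` must not be called with a zero-sized layout.
        if size == 0 {
            return Err(AllocFailure::BadLayout);
        }
        let layout = Layout::from_size_align(size, 1).map_err(|_| AllocFailure::BadLayout)?;
        // SAFETY: `layout` has a non-zero size, as checked above.
        let raw = unsafe { alloc(layout) };
        NonNull::new(raw)
            .map(|ptr| RawBuf { ptr, layout })
            .ok_or(AllocFailure::OutOfMemory)
    }

    /// Number of bytes owned by this buffer.
    pub fn len(&self) -> usize {
        self.layout.size()
    }
}

impl Drop for RawBuf {
    fn drop(&mut self) {
        // SAFETY: `ptr` was returned by `alloc` with exactly this layout.
        unsafe { dealloc(self.ptr.as_ptr(), self.layout) };
    }
}
```

The caller decides what to do with the `Err` case, which is the property being discussed here.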

However, on top of that, I don't see how

  >  mostly a consequence of eschewing exceptions.
and

  > No custom allocator can make the task gracefully report failure
  > instead of panicing.
work together. Or rather, why is panic-ing bad, but an exception good?

> why is panic-ing bad, but an exception good?

Because the Rust people don't believe in making "catch" a first-class primitive in the language, and in fact, fully support a runtime option to turn all panics into aborts.

Even if abort-on-panic were to be killed as a legal mode of operation, and even if the stigma were to be removed from std::panic::recover, we'd still be left with a language with two error handling strategies and endless programmer confusion over which to use.

Rust's designers have done permanent damage to the language by not making exceptions the primary error reporting mechanism available to programmers, and it's not a mistake they can undo now.

> Because the Rust people don't believe in making "catch" a first-class primitive in the language, and in fact, fully support a runtime option to turn all panics into aborts.

recover() exists. You're right, there's a stigma to it, because you're not supposed to use it unless you really need to (hence, no programmer confusion). It's supposed to be used for situations like:

- Catching panics before crossing an FFI boundary

- End-of-the-world situations like OOM where you want to still handle it somehow

- Ensuring that applications can recover from internal panics in libraries (though there should be little to no panics in the libraries anyway)

The stigma for recover is for using it where you're not supposed to; as a substitute for regular error handling. In this situation, you are supposed to, so the stigma doesn't apply.
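As a rough sketch of the FFI case from the list above: the example below uses `std::panic::catch_unwind`, which is the name under which recover() was later stabilized, so treat it as an assumption relative to the API discussed in this thread. `checked_divide` and its status codes are hypothetical.

```rust
use std::panic;

/// Hypothetical callback handed to C code.
/// Returns 0 on success and -1 if a panic was caught.
/// A real export would also need a `no_mangle` attribute; it is left out here.
pub extern "C" fn checked_divide(a: i32, b: i32, out: *mut i32) -> i32 {
    // Integer division panics when `b` is zero. Catching the panic here keeps
    // the unwind from crossing into the C caller.
    let result = panic::catch_unwind(|| a / b);
    match result {
        Ok(value) => {
            if !out.is_null() {
                // SAFETY: the caller promises `out` points to a writable i32.
                unsafe { *out = value };
            }
            0
        }
        Err(_) => -1,
    }
}
```

The panic is converted into an ordinary status code at the boundary; everything inside the Rust side can keep using Result for regular errors.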

The fact that it's not a first-class primitive seems mostly irrelevant to me. Rust does a lot of things in library functions and types, even our concurrency safety mechanisms are something that can be duplicated in a library. As long as it can be used, what does it matter?

The fact that you can set the panic handler at runtime is also irrelevant. If you want to catch panics, don't do that.

The problem with the dualistic error handling strategy you're proposing is that the "severe" path gets even less testing than normal error recovery schemes do. Imagine you're working with a big non-exceptional C++ codebase (e.g., Firefox) and somebody throws std::bad_alloc. Even if you don't abort immediately and let the exception unwind the stack, the unwinding process will still leave lots of invariants broken, since all the cleanup paths are wired to return codes and will not run on unwinding.

The result is that your program can be almost arbitrarily broken after throwing. You might as well have just called longjmp.

It's because unwinding in only rare cases often produces bad results that I favor making unwinding the only error-reporting machinery in a language. If you use exceptions to report all errors, everyone starts caring about exception safety again.

Note that recover() uses Rust's type system to enforce certain things about exception safety. It's harder to mess up, even if libraries are written without unwinding in mind.

Exceptions can be turned into aborts in C++ as well, and are in many types of programs, because exceptions do have downsides for some problem domains. If Rust forced exceptions on everyone, there'd be people complaining about that just like you're complaining now.

I see the split between `Result` and `panic!` as more like Java's split between checked and runtime exceptions, except `Result` is much more usable than checked exceptions because it's part of the main data flow path, and so can use method chaining combinators instead of unwieldy try/catch blocks. OOM in Java is, like in Rust, not a checked exception, because it's not something you'd want to handle everywhere it can happen, but rather something to propagate up the stack transparently.
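A small sketch of what "method chaining combinators" means in practice. The function names and the port rule are made up for the example; only the `Result` methods (`map_err`, `and_then`, `unwrap_or`) are standard.

```rust
use std::num::ParseIntError;

/// Parses a port number and accepts it only if it is not privileged.
pub fn parse_port(input: &str) -> Result<u16, String> {
    input
        .trim()
        .parse::<u16>()
        // Convert the parse error into a message, staying in the Result.
        .map_err(|e: ParseIntError| format!("not a port: {}", e))
        // Chain a second fallible step without any try/catch block.
        .and_then(|port| {
            if port >= 1024 {
                Ok(port)
            } else {
                Err(format!("port {} is privileged", port))
            }
        })
}

/// Falls back to a default when any step above failed.
pub fn port_or_default(input: &str) -> u16 {
    parse_port(input).unwrap_or(8080)
}
```

The error stays on the main data flow path the whole time, which is the contrast with checked exceptions being drawn here.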

> Exceptions can be turned into aborts in C++ as well,

No, they can't. -fno-exceptions does not appear in the C++ standard. You can write a compiler for any language. C++-that-aborts-on-throw is not C++, although, sure, it's closely related.

The ability to turn off C++ exceptions was a temporary workaround for compiler deficiencies in the 1990s that snowballed into an extremely harmful schism that's still doing tremendous damage to the C++ community.

The difference between -fno-exceptions and Rust's abort-on-panic is that the former is an unofficial, disgusting hack, while the latter is getting full official support for some reason.

That's not a very meaningful distinction to make: Rust doesn't even have a standard right now. Besides, -fno-exceptions is quite useful today, not just because of 90s compiler deficiencies, and is pretty well-supported by compilers.

The existence of -fno-exceptions means that library authors either use the language as intended and accept losing a portion of their potential user base, or write less-than-optimally elegant and clear code, which punishes everyone so that a few can turn off a core feature of the language. It fragments the community.

This is an area where you just can't actually please everyone. I have heard the same opinions you've expressed in this thread, just as strongly, against even including unwinding at all: that aborts should be the only option, and that the cost of unwinding is far too high to be included in a true systems language.

Language design is tough. I'm glad we have multiple languages.

It's _because_ Rust tried to please everyone that it painted itself into this corner. If the exception people had won, life would have been great.

But if the error-code people had won, then life would still be good, because then Rust's stdlib might have been a bit uglier, but it would at least be correct with respect to error propagation. It's because Rust tries to satisfy both camps --- because it tries to give you the concision of exception code and, er, the lack of actual exceptions --- that it's forced into the terrible position of needing to abort internally on error, lacking a way to report errors to higher level code.

The lesson here is that optimizing for happiness and harmony leads to bad design.

I prefer "taking all use-cases seriously instead of abandoning a segment of users" to "happiness and harmony," as a characterization here. If serious use cases were not presented for both options, we would have enforced one. Or, if Rust weren't a systems language, we could have enforced one.

At the end of the day, if you have exceptions, you can still call abort in your exception handlers, so the split exists regardless. And without first-class support, those users are paying for a feature that they aren't using, which is against a core value of Rust.

You are arguing for replacing bad behavior "abort on OOM" with something even worse, exceptions. I honestly don't think you know what exceptions entail wrt what compilers do and the resulting bloat.

What, unwind tables? The ones that go untouched in normal operation? They're hardly catastrophic, and you need unwind support as a mandatory part of some ABIs in the first place. I know perfectly well what exceptions entail, and I maintain they're vastly better than other error handling strategies. You're the one who doesn't know what he's talking about.

You do understand that it usually adds somewhere in the ballpark of 20% or more to text, which is in the loaded part of the program, right? Also, define mandatory: what requires eh_frame?

There is an exception-like mechanism in Rust, in the form of the "try!" macro. It's a lot more flexible, but somewhat more verbose (Haskell has the same mechanism in a way that looks a lot more like exceptions, so that's not an inherent flaw). The best explanation I've seen is this:

http://www.jonathanturner.org/2015/11/learning-to-try-things...

tl;dr: "Result"s are like exceptions which are caught by default. You can (explicitly) propagate them upwards by using try!(...). This is nice because it means that you can tell what exceptions can occur in a block of code only using "local" information.
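A minimal sketch of that explicit propagation. It uses the `?` operator, which later replaced the try! macro with the same early-return behavior; that substitution, along with `ConfigError`, `lookup`, and `read_size`, is an assumption of this example rather than something from the linked article.

```rust
use std::num::ParseIntError;

/// Hypothetical error type listing everything `read_size` can fail with.
#[derive(Debug, PartialEq)]
pub enum ConfigError {
    Missing(&'static str),
    BadNumber(ParseIntError),
}

// Lets `?` convert a parse failure into our error type automatically.
impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::BadNumber(e)
    }
}

/// Finds the value stored under `key`, or reports which key is missing.
fn lookup<'a>(pairs: &[(&'a str, &'a str)], key: &'static str) -> Result<&'a str, ConfigError> {
    pairs
        .iter()
        .find(|(k, _)| *k == key)
        .map(|(_, v)| *v)
        .ok_or(ConfigError::Missing(key))
}

/// Each `?` marks a point where an error is passed up to the caller.
pub fn read_size(pairs: &[(&str, &str)]) -> Result<u32, ConfigError> {
    let width = lookup(pairs, "width")?.parse::<u32>()?;
    let height = lookup(pairs, "height")?.parse::<u32>()?;
    Ok(width * height)
}
```

Every place an error can leave `read_size` is visible in its body, and the signature names the full set of failures, which is the "local information" point made above.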

> There is an exception-like mechanism in Rust, in the form of the "try!" macro.

Correct. That's not the problem. If Rust's standard library returned Result in all cases where allocation could fail, I'd be satisfied. My primary issue is that they didn't, because Result is awkward.

Rust's designers went wrong in trying to have their cake and eat it too. They wanted to avoid exceptions and not make people care locally about errors. That's why they assert that errors just don't happen and abort if they do.

Throwing exceptions is a reasonable design choice. Returning error codes is a reasonable design choice. Pretending errors don't exist is not.

> Pretending errors don't exist is not.

We don't and we never have.

I don't think that there's any guarantee in Rust that malloc failure will abort rather than panic. That just happens to be the current implementation. I'm not sure I've ever heard of anyone running into that being an issue in practice, as opposed to this kind of abstract discussion. But I think that it wouldn't be considered a breaking change to switch from aborting to panicking if there were any kind of demand for it.

In Rust, exceptions (panic) are used for truly exceptional situations, like programmer error (indexing beyond the end of an array, division by zero) or things that practically are not expected to happen in a recoverable way in the course of ordinary use, like malloc failure. On modern virtual memory operating systems, malloc failure is so unlikely, and in application code there's so little you could reasonably do if it happened, that it is considered to be a truly exceptional case.

On the other hand, Result is used for those kinds of errors that are expected to happen in practice even with working code on reasonable systems. IO errors, errors decoding UTF-8, etc.

Right now, catching exceptions (panics) using recover() is still considered unstable. There is some work ongoing to work out an API that helps ensure safety by marking types based on whether they are exception-safe or not. You can use recover() with types that are built in an exception-safe way, or you can wrap types in AssertRecoverSafe to assert that you are providing exception-safety guarantees yourself. But you can't just arbitrarily recover from panics in code that has access to arbitrary data without someone having added an annotation somewhere that they believe that the code is exception-safe. See https://github.com/rust-lang/rust/issues/27719 for details. Note that based on the latest discussion, recover() will likely be named something else involving "unwind" to be more explicit about what it's doing.
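A sketch of the annotation step described above, written against the names this API was later stabilized under (`std::panic::catch_unwind` and `AssertUnwindSafe`), which is an assumption relative to the unstable recover() of this thread. `push_then` is a hypothetical helper.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Pushes `item`, then runs `risky`. If `risky` panics, the push is rolled
/// back so the log is left as it was. Returns whether the call succeeded.
pub fn push_then<F: FnOnce()>(log: &mut Vec<String>, item: &str, risky: F) -> bool {
    let before = log.len();
    // A closure capturing `&mut Vec` is not considered unwind-safe by default,
    // so it has to be wrapped explicitly. The wrapper is our assertion that we
    // restore the invariant ourselves after a panic.
    let outcome = panic::catch_unwind(AssertUnwindSafe(|| {
        log.push(item.to_string());
        risky();
    }));
    if outcome.is_err() {
        // Undo the partial update before anyone can observe it.
        log.truncate(before);
        false
    } else {
        true
    }
}
```

Without the `AssertUnwindSafe` wrapper the call to `catch_unwind` is rejected at compile time, which is the "someone has to add an annotation" rule in concrete form.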

And exception safety is quite important to the Rust authors. Note that Mutex has a built-in exception safety mechanism, poisoning the mutex on panic so that other users can't accidentally access the protected resource without being aware that another thread panicked while holding it.
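The poisoning behavior can be sketched with the standard `Mutex`; `poison_demo` is a hypothetical function written only for this illustration.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Returns whether the lock was found poisoned, plus the value inside it.
pub fn poison_demo() -> (bool, i32) {
    let shared = Arc::new(Mutex::new(0));
    let worker = Arc::clone(&shared);

    // The worker panics while holding the guard, which poisons the mutex.
    let joined = thread::spawn(move || {
        let mut guard = worker.lock().unwrap();
        *guard = 1;
        panic!("worker died while holding the lock");
    })
    .join();
    assert!(joined.is_err());

    // Later users get an Err and must opt in to reading the data anyway.
    let observed = match shared.lock() {
        Ok(guard) => (false, *guard),
        Err(poisoned) => (true, *poisoned.into_inner()),
    };
    observed
}
```

The data is still reachable through `into_inner`, but only after the caller has acknowledged that another thread panicked mid-update.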

Now, there are times when handling memory allocation failures properly is more important, such as in embedded systems or in operating system kernels, where you don't have a virtual memory abstraction with over-provisioning. However, in those cases you couldn't use the standard library anyhow, as the standard library depends on OS support; so you might as well use alternate data types that do return Result on allocating operations.
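For a sense of what a Result-returning allocating operation looks like, here is a sketch using `Vec::try_reserve` and `Vec::try_reserve_exact`. These methods were added to the standard library well after this thread, so they are an assumption here and not the alternate data types referred to above. `copy_fallible` and `reserve_bytes` are hypothetical helpers.

```rust
use std::collections::TryReserveError;

/// Copies `src` into a new vector, reporting allocation failure as an error.
pub fn copy_fallible(src: &[u8]) -> Result<Vec<u8>, TryReserveError> {
    let mut out = Vec::new();
    // The only allocation happens here, and it can fail gracefully.
    out.try_reserve(src.len())?;
    // Capacity is already reserved, so this does not need to allocate.
    out.extend_from_slice(src);
    Ok(out)
}

/// Tries to reserve room for `n` bytes without aborting on failure.
pub fn reserve_bytes(n: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut v = Vec::new();
    v.try_reserve_exact(n)?;
    Ok(v)
}
```

An impossible request such as `reserve_bytes(usize::MAX)` comes back as an `Err` that the caller can handle like any other error.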

I'm just not sure about the utility of providing a convenient way to recover from malloc failure in applications running on virtual-memory operating systems. Can you show me an example in C++ (or any other language) where this is handled properly in application code in any way that doesn't simply log and abort, in which all unwinding code in the same application also avoids allocation as it may occur while unwinding from an allocation failure, and in which these code paths are actually tested in the test suite to ensure they behave properly?

> Right now, catching exceptions (panics) using recover() is still considered unstable. ... you can't just arbitrarily recover from panics in code that has access to arbitrary data without someone having added an annotation somewhere that they believe that the code is exception-safe

And it's for this reason that I don't think I'll be choosing Rust for any of my projects in the near future. This cavalier attitude toward memory exhaustion is not only concerning in itself, but also makes me doubt the robustness and design principles of the rest of the system.

Besides, if you make exception-safe code difficult to write, nobody in practice will write it, so you'll end up with a system that's tantamount to one that just aborts. Saying that "Rust the language handled OOM just fine without stdlib!" and "we can convert OOM to panic!" is useless when these measures don't help real world code.

> In Rust, exceptions (panic) are used for truly exceptional situations

I've never accepted the argument that we need to use one error-recovery scheme for "normal" errors and another for "exceptional" ones. That kind of claim sounds reasonable, sober, and measured, but it leads to bad outcomes in every system I've seen, because the "exceptional" case in practice becomes a hard abort. A unified error handling scheme is a boon because it greatly simplifies the cognitive analysis of errors.

Java is a good example of how to do right-ish. Serious errors are Throwables not derived from Exception, so normal catch blocks are unlikely to catch them. But serious errors are still exceptions (if not Exception), and all the usual language features for processing exceptions, including unwinding, stack trace recording, and chaining, operate normally.

Uniformity of error processing in Java is a great feature, and the language gets it without sacrificing the ability to distinguish between serious and expected errors. Now, I'm not arguing that Rust get checked exceptions, but I do have to insist that experience shows that you don't need two completely different error handling mechanisms (say, panic and Result) to mark problem severity.

> But I think that it wouldn't be considered a breaking change to switch from aborting to panicking if there were any kind of demand for it.

I'm not comfortable with casual changes in core runtime semantics.

> On modern virtual memory operating systems,

Are you just defining "modern" as "overcommit"? People (especially from the GNU/Linux world) constantly assert that allocation failure is rare, but I've seen allocations fail plenty of times, due to both address space exhaustion and global memory exhaustion. I don't have any firm numbers, but I haven't seen any from the abort-on-failure camp either.

> Can you show me an example in C++ (or any other language) where this is handled properly in application code in any way that doesn't simply log and abort, in which all unwinding code in the same application also avoids allocation as it may occur while unwinding from an allocation failure, and in which these code paths are actually tested in the test suite to ensure they behave properly?

SQLite [1] and NTFS [2] come to mind, as well as lots of tools I've discovered.

[1] https://www.sqlite.org/malloc.html

[2] guaranteed to make forward progress; pre-reserves all needed recovery resources; yes, I know NTFS runs in ring zero, but it's not the case that the kernel doesn't have to deal with dynamic memory allocation

  Besides, if you make exception-safe code difficult to 
  write, nobody in practice will write it, so you'll end up 
  with a system that's tantamount to one that just aborts. 
  Saying that "Rust the language handled OOM just fine 
  without stdlib!" and "we can convert OOM to panic!" is 
  useless when these measures don't help real world code.
I'm not sure where you get the "difficult to write" part from. It's no more or less difficult to write than in any other language, as far as I know; you just do have to go through the effort to indicate that "yes, I did think this through and believe this is exception safe" for types that you want to be able to use across an exception-catching boundary.

As I said, work is ongoing to determine if this AssertUnwindSafe approach is actually workable in practice. The initial implementation had some usability issues, but it looks like it may be more workable now that you can use it on the entire closure if you need to. It's still a speedbump, but a very minor one.

  That kind of claim sounds reasonable, sober, and 
  measured, but it leads to bad outcomes in every system 
  I've seen, because the "exceptional" case in practice 
  becomes a hard abort.
Can you point out what these bad outcomes or bad systems have been? I agree that in practice, the most common case is that the exceptional case becomes a hard abort, but I don't necessarily agree that that's a bad thing.

For people who are not trying to write extremely fault-tolerant code like SQLite, and going to great lengths to do so, that is a good thing; adding some half-assed normal error handling around these truly exceptional cases is more likely to lead to mistakes and problems down the line than just aborting is.

For people who are trying to write extremely robust, fault tolerant code, you can either handle panics, or avoid the standard library and do error handling via results. Both approaches should be viable, depending on your requirements; the standard library does take exception safety into account, so it shouldn't on its own cause issues if you handle errors via panics.

  I'm not comfortable with casual changes in core runtime 
  semantics.
But you are comfortable with the sheer amount of undefined and unspecified behavior in C and C++? Remember, at the moment Rust only has a single implementation and no formal specification, while C and C++ have many different implementations, and the standards allow very wide amounts of leeway in how implementations differ.

Now, Rust not having a formal specification or multiple implementations is not a good thing; it's just a fact of life for a language that is not yet very mature. But I think that this particular behavior is something that should be considered similar to unspecified behavior at the moment. Just like out of memory situations or stack overflow behave differently on different platforms in C and C++ at the moment, how the Rust runtime behaves on out of memory could also be subject to change or different implementations. Given the standard library API, you couldn't return a result, but either aborting or panicking would both be consistent with the language as currently defined.

  People (especially from the GNU/Linux world) constantly 
  assert that allocation failure is rare
I'm not asserting that allocation failure is rare. Just that there are some cases where you don't have a chance to handle it at all, like GNU/Linux where you overcommit, and that handling it in any way other than abort is rare.

  SQLite [1] and NTFS [2] come to mind, as well as lots of 
  tools I've discovered.
Neither SQLite nor NTFS use exceptions, nor are they applications, so they aren't very good examples of applications using exception handling to deal with memory allocation failure.

SQLite is written in C, which doesn't have exceptions, nor a standard library similar to the C++ or Rust standard library. SQLite has had to implement all of their data structures by hand. You can do exactly the same in Rust by using #![no_std] and just using the core library, which only defines basic data types and never allocates.

NTFS lives in the NT kernel, which doesn't have support for exceptions either, nor does it use the C++ standard library.

So yes, you can actually write code that handles allocation failure properly. The examples you've given both eschew a high-level standard library, and instead implement all of their data structures and memory handling themselves, reporting errors by passing error values back. All of which you can do in Rust using #![no_std].
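As a rough sketch of that style: the container below never allocates and reports "no room" as an error value. It only uses items that exist in the core library, so it would also build in a crate marked #![no_std]; the attribute is left out here so the snippet stands alone. `FixedStack` and `Full` are hypothetical names.

```rust
/// Fixed-capacity stack stored inline; it never touches the heap.
pub struct FixedStack<T, const N: usize> {
    items: [Option<T>; N],
    len: usize,
}

/// Error returned when the stack is full; hands the rejected value back.
#[derive(Debug, PartialEq)]
pub struct Full<T>(pub T);

impl<T, const N: usize> FixedStack<T, N> {
    /// Creates an empty stack with all `N` slots unused.
    pub fn new() -> Self {
        FixedStack {
            items: core::array::from_fn(|_| None),
            len: 0,
        }
    }

    /// Adds a value, or returns it inside `Full` when no slot is left.
    pub fn push(&mut self, value: T) -> Result<(), Full<T>> {
        if self.len == N {
            return Err(Full(value));
        }
        self.items[self.len] = Some(value);
        self.len += 1;
        Ok(())
    }

    /// Removes and returns the most recently pushed value, if any.
    pub fn pop(&mut self) -> Option<T> {
        if self.len == 0 {
            return None;
        }
        self.len -= 1;
        self.items[self.len].take()
    }
}
```

Running out of space is an ordinary `Err` that the caller has to deal with, which mirrors the error-value style of the C examples.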

Meanwhile, there are lots of user-space applications that people use all the time which have no special handling for OOM situations; they rely on the OS to provide them with sufficient amounts of virtual memory, and either die from an unhandled exception, abort explicitly on getting NULL from malloc, or get killed by an OOM killer if they exceed the capacity of the machine and try to access an overcommitted page.

I'm sure there are some examples out there, somewhere, of user-space applications that actually do catch such issues and attempt to do graceful cleanup. On the other hand, I don't know how successful they will be, especially if they have to be cross-platform. Any kind of cleanup you may do, such as writing state out to disk before dying, will hit the kernel's page cache, which may involve allocating memory, which may fail in such a situation. So even if you do try to handle the issue gracefully in user-space, there may not be anything you can do.

There was a really interesting article on error handling in languages recently:

http://joeduffyblog.com/2016/02/07/the-error-model/

It makes the case that you do in fact want two different error handling mechanisms, because there are two quite different kinds of errors. The author argues that running out of memory is most practically treated as an unrecoverable error which aborts the process.