Hacker News new | ask | show | jobs
by Manishearth 3742 days ago
A failed malloc aborts the process. If this is important to you, don't use the heap abstractions in the stdlib then. This is no different from the situation in C++.

(You can also plug in a custom allocator which behaves differently)

2 comments

On Linux malloc never fails actually. Instead, the kernel kills processes if it runs out of memory.
Overcommit doesn't mean that malloc never fails. malloc will fail if you ask for an allocation that won't fit in the virtual address space (for instance because of your rlimit settings, or if you asked for a 3G chunk on a 32-bit system).
A lot of folks are pointing out that malloc can fail, which is true, but the important part is that there are situations where your application will just abort randomly in the middle of nowhere (i.e. not during malloc) and there's nothing you can do about it. There are also situations where malloc fails and returns a null, but given the existence of situations with no errors on a malloc, handling the error in these cases isn't close to a solution. No language or stdlib can solve this problem 100%.

At a baremetal level, it helps having this, but you're probably better off not using the stdlib anyway.

Bullshit. Every part of the C++ standard library (which is _much_ bigger than Rust's) can gracefully propagate resource exhaustion errors, including memory allocation failure, to callers. That Rust can't is a design flaw.

> there are situations where your application will just abort randomly

That's sure as hell not the case in any environment I choose to use.

> can gracefully propagate resource exhaustion errors, including memory allocation failure, to callers

only on malloc. If the kernel overcommits, your process will abort when you try to use the memory, possibly way after the malloc and there's nothing you can do about it. That's the point being made here.

> That Rust can't is a design flaw.

(This is false, see Steve's reply above about this)

The world is not Linux. I happen to believe that believe that overcommit in the Linux kernel is a disgrace. It is, however, at least possible to disable it. It's not possible to retroactively add real exceptions to Rust, or to change the signature of all memory-allocation function to return Result.

Rust is supposed to be a general-purpose systems programming language, not a Linux programming language. Windows does not overcommit. A correctly configured Linux system does not overcommit. Lost of embedded systems don't (and can't) overcommit. Are you saying all of these people should avoid Rust's standard library?

> > That Rust can't is a design flaw.

> (This is false, see Steve's reply above about this)

It's clear that my opinion differs from that of many Rust developers and users. I still think I'm correct, that these developers and users are misguided, and that as Rust attempts to fill more niches, experience will show that my position is the correct one.

All I can say is that I personally will not use any language that bakes cornucopian assumptions about memory baked into its core library. I know that you say that it's possible to just avoid stdlib --- but the temptation to use it will be irresistible, and once somebody succumbs to the temptation, the entire program is now capable of aborting irrecoverably.

I will stick with languages . Modern C++ is safe and expressive enough, and it correctly reacts to resource exhaustion.

> The world is not Linux.

Sure, but if linux has this issue, then C++ programs on linux will also have this issue, and the language can't solve that. That's all my point was.

> or to change the signature of all memory-allocation function to return Result.

When custom allocators part 2 happens, you can. I've already argued the "real exceptions" part above.

> Rust is supposed to be a general-purpose systems programming language, not a Linux programming language. Windows does not overcommit. A correctly configured Linux system does not overcommit. Lost of embedded systems don't (and can't) overcommit. Are you saying all of these people should avoid Rust's standard library?

No. My point was simply that no language has a complete solution to this problem.

Most people don't need to worry about OOM; abort-on-OOM is the expected behavior. For the people who do, there is a mechanism to handle it, as explained above. I can't help it if you have an idealogical issue with that mechanism. But ultimately, it works and can be used.

> On Linux malloc never fails actually.

Not true. For example, it can fail if there is no big enough chunk of virtual address space available (i.e. your heap is fragmented enough and your attempted allocation is big enough). I've even seen 64-bit processes manage to do this, my mmapping lots of multi-GB things at once and then trying to do large allocations.

Overcommit will fail if you go over the overcommit ratio. In addition, the process will die if you start using the fake pages the kernel gave you. But yes, malloc can fail on Linux.
You're correct in the way that it's default behavior, but the behavior can be turned off.

Also, if you're using cgroups (or anything that leverages that like containers) you can put a limit on the memory resource and then malloc will fail. Which is common enough, and only becoming more common as people are collocating a lot of disparate workloads on nodes (using the kurbenets scheduler).

No custom allocator can make a task that fails to allocate gracefully report an error. Rust's error handling design is just terrible, and mostly a consequence of eschewing exceptions.

Had Rust opted for exceptions, it'd be a much better, and actually usable, language. Rust's terribly error-handling strategy is the chief reason not to use it.

> No custom allocator can make a task that fails to allocate gracefully report an error.

Yes, you can. You can panic the thread, and you can recover from panics. But honestly this is never enough to gracefully recover from OOM. I can't think of any software that uses exceptions to gracefully recover from OOM (i.e. without crashing the process) in a way that works. How many Java applications do you see that catch OutOfMemoryError on a fine-grained level?

> Had Rust opted for exceptions, it'd be a much better, and actually usable, language. Rust's terribly error-handling strategy is the chief reason not to use it.

It's hyperbolic to call Rust's error handling strategy not "actually usable". I use it every day and never have any issues with it.

> How many Java applications do you see that catch OutOfMemoryError on a fine-grained level?

I wrote exactly that sort of code, in Java, last week.

I believe you that you do. That doesn't change the fact that few people actually do it or need to do it. There are many more people who think they need to recover from OOM but actually don't and would be better off if they didn't try. (There have been many security vulnerabilities that have resulted from attempting to handle OOM gracefully that wouldn't have been an issue if malloc just aborted the process.)
> That doesn't change the fact that few people actually do it or need to do it.

I am one of those few people. We are running some scientific code (currently, unfortunately, written in C++) on a heterogeneous bunch of compute nodes. Some computations can be extremely memory-intensive, and sometimes in ways that we didn't predict. So it's useful to be able to fail gracefully and record that computation X on node Y with input parameters [Z] failed specifically due to running out of memory at step W - so that e.g. the queue manager can try relaunching the computation on a beefier node or adjusting how many instances of which computation are allowed on Y.

That's something that Rust fully supports with panic handlers. Doing arbitrary work before the process goes down is useful and supported. (But you will have to be running Linux in a non-default configuration for it to be reliable, of course!)
I agree with you, but a general-purpose systems programming language needs to let _me_ make that determination. It can't abort on my behalf for my own good. It's depressing that Java, of all things, does a better job in this respect than Rust.

And we probably shouldn't be writing so much software in general-purpose systems programming languages.

> I agree with you, but a general-purpose systems programming language needs to let _me_ make that determination. It can't abort on my behalf for my own good.

You can decide. You can use the standard library and deal with exceptions, or you can not use the standard library and deal with malloc failure yourself. The Rust standard library is opinionated in this regard, because it's rarely ever a good idea to try to recover from malloc failure for userland processes.

That said, with recover, which will probably be stabilized, you can recover from malloc problems, which are turned into panics. But I'm sure you know that this can be unreliable on Linux with the default overcommit turned on, and so forth. https://doc.rust-lang.org/std/panic/fn.recover.html

Note all of the debate on the linked issue as to whether recover is a good idea. Most of the Rust community is very hesitant to even allow catching panics at all; they certainly don't find the current situation "unusable".

> It's depressing that Java, of all things, does a better job in this respect than Rust.

I think that Java shouldn't throw exceptions on OOM. It should just abort the process.

> and mostly a consequence of eschewing exceptions.

A panic is the same thing as an exception. If you want to catch a panic, use recover(), it's meant to be used exactly for these end-of-the-world panic scenarios (and for FFI/etc).

You can plug in a custom allocator that panics on OOM (I think the standard one aborts).

As Steve mentioned, custom allocators can mean two things. The type that exists in rust today is one where you can make OOM panic, but not have allocation methods return Result. A planned extension will let you have allocators with different semantics entirely work with stdlib types (via defaulted type parameters); and this will let you use regular error handling with stdlib types too.

> A planned extension will let you have allocators with different semantics entirely work with stdlib types

And what about all the code that doesn't? It's because so much code exists that's completely oblivious to the possibility of these stdlib functions failing that I don't think that merely adding the option to do the right thing is good enough. The existing failure-oblivious APIs need to be explicitly deprecated.

The only ways to redeem Rust is to either support exceptions as first-class citizens with mandatory runtime support or to convert all existing allocating stdlib functions to return Result and mark all the existing failure-oblivious ones as being as deprecated as gets(3) in C.

> The existing failure-oblivious APIs need to be explicitly deprecated.

That's total overkill. For 99% of applications, process abort is fine, and dealing with it is just noise. Those 1% are usually things like kernels that use custom standard libraries anyway.

We're not doing the 99% a favor by making them think about OOM every time they do something that might allocate.

> The only ways to redeem Rust is to either support exceptions as first-class citizens with mandatory runtime support or to convert all existing allocating stdlib functions to return Result and mark all the existing failure-oblivious ones as being as deprecated as gets(3) in C.

This is silly hyperbole. Ask anyone who works in security whether the danger of xmalloc() is comparable to the danger of gets(). In fact, I've seen many security folks recommend only using xmalloc() with process abort instead of trying to explicitly handle OOM failures!

Even C++ doesn't have mandatory exception support, and even Rust can catch panics from failure-oblivious code.
> Even C++ doesn't have mandatory exception support

Yes it does. That some compilers provide a way to disable mandatory language features is no argument.

> even Rust can catch panics from failure-oblivious code.

Not while maintaining that code's invariants it can't.

> Yes it does. That some compilers provide a way to disable mandatory language features is no argument.

It's actually very relevant that huge amounts of C++ deployed in the world use -fno-exceptions, and many shops (for example, Google!) have a policy of "we do not use exceptions". I don't care about how well languages handle OOM in theory; what matters is how well they handle it in practice.

Are you saying C++ make it easy to write exception-safe code? Because Rust explicitly encodes exception safety into the type system with the RecoverySafe trait, you need to write unsafe code to bypass that, and the documentation on unsafe explicitly covers exception safety.

  > No custom allocator can make the task gracefully report failure
  > instead of panicing.
So, first of all, "custom allocators" means two things:

  * overloading the allocator that's used by liballoc, and
    the crates that depend on it, like libstd
  * other allocators entirely
The first is described here: https://doc.rust-lang.org/book/custom-allocators.html

And the second is still in RFCs: https://github.com/rust-lang/rfcs/pull/1398

Both of these things are not yet stable. The second does, in fact, give you the ability to return an error code, by returning a Result.

However, on top of that, I don't see how

  >  mostly a consequence of eschewing exceptions.
and

  > No custom allocator can make the task gracefully report failure
  > instead of panicing.
Work together. Or rather, why is panic-ing bad, but an exception good?
> why is panic-ing bad, but an exception good?

Because the Rust people don't believe in making "catch" a first-class primitive in the language, and in fact, fully support a runtime option to turn all panics into aborts.

Even if abort-on-panic were to be killed as a legal mode of operation, and even if the stigma were to be removed from std::panic::recover, we'd still be left with a language with two error handling strategies and endless programmer confusion over which to use.

Rust's designers have done permanent damage to the language by not making exceptions the primary error reporting mechanism available to programmers, and it's not a mistake they can undo now.

> Because the Rust people don't believe in making "catch" a first-class primitive in the language, and in fact, fully support a runtime option to turn all panics into aborts.

recover() exists. You're right, there's a stigma to it, because you're not supposed to use it unless you really need to (hence, no programmer confusion). It's supposed to be used for situations like:

- Catching panics before crossing an FFI boundary

- End-of-the-world situations like OOM where you want to still handle it somehow

- Ensuring that applications can recover from internal panics in libraries (though there should be little to no panics in the libraries anyway)

The stigma for recover is for using it where you're not supposed to; as a substitute for regular error handling. In this situation, you are supposed to, so the stigma doesn't apply.

The fact that it's not a first-class primitive seems mostly irrelevant to me. Rust does a lot of things in library functions and types, even our concurrency safety mechanisms are something that can be duplicated in a library. As long as it can be used, what does it matter?

The fact that you can set the panic handler at runtime is also irrelevant. If you want to catch panics, don't do that.

The problem with the dualistic error handling strategy you're proposing is that the "severe" path gets even less testing than normal error recovery schemes do. Imagine you're working with a big non-exceptional C++ codebase (e.g., Firefox) and somebody throws std::bad_alloc. Even if you don't abort immediately and let the exception unwind the stack, the unwinding process will still leave lots of invariants broken, since all the cleanup paths are wired to return codes and will not run on unwinding.

The result is that your program can be almost arbitrarily broken after throwing. You might as well have just called longjmp.

It's because unwinding in only rare cases often produces bad results that I favor making unwinding the only error-reporting machinery in a language. If you use exceptions to report all errors, everyone starts caring about exception safety again.

Note that recover() uses Rust's type system to enforce certain things about exception safety. It's harder to mess up, even if libraries are written without unwinding in mind.
Exceptions can be turned into aborts in C++ as well, and are in many types of programs, because exceptions do have downsides for some problem domains. If Rust forced exceptions on everyone, there'd be people complaining about that just like you're complaining now.

I see the split between `Result` and `panic!` as more like Java's split between checked and runtime exceptions, except `Result` is much more usable than checked exceptions because it's part of the main data flow path, and so can use method chaining combinators instead of unwieldy try/catch blocks. OOM in Java is, like in Rust, not a checked exception, because it's not something you'd want to handle everywhere it can happen, but rather something to propagate up the stack transparently.

> Exceptions can be turned into aborts in C++ as well,

No you can't. -fno-exceptions does not appear in the C++ standard. You can write a compiler for any language. C++-that-aborts-on-throw is not C++, although, sure, it's closely related.

The ability to turn off C++ exceptions was a temporary workaround for compiler deficiencies in the 1990s that snowballed into an extremely harmful schism that's still doing tremendous damage to the C++ community.

The difference between -fno-exceptions and Rust's abort-on-panic is that the former is an unofficial, disgusting hack, while the latter is getting full official support for some reason.

That's not a very meaningful distinction to make- Rust doesn't even have a standard right now. Besides, -fno-exceptions is quite useful today, not just because of 90s compiler deficiencies, and is pretty well-supported by compilers.
This is an area where you just can't actually please everyone. I have heard the same opinions you've expressed in this thread, just as strongly, for even including unwinding at all. That aborts should be the only option, and that the cost of unwinding is far too high to be included in a true systems language.

Language design is tough. I'm glad we have multiple languages.

It's _because_ Rust tried to please everyone that it painted itself into this corner. If the exception people had won, life would have been great.

But if the error-code people had won, then life would still be good, because then Rust's stdlib might have been a bit uglier, but it would at least be correct with respect to error propagation. It's because Rust tries to satisfy both camps --- because it tries to give you the concision of exception code and, er, the lack of actual exceptions --- that it's forced into the terrible position of needing to abort internally on error, lacking a way to report errors to higher level code.

The lesson here is that optimizing for happiness and harmony leads to bad design.

I prefer "taking all use-cases seriously instead of abandoning a segment of users" to "happiness and harmony," as a characterization here. If serious use cases were not presented for both options, we would have enforced one. Or, if Rus weren't a systems language, we could have enforced one.

At the end of the day, if you have exceptions, you can still call abort in your exception handlers, so the split exists regardless. And without first-class support, those users are paying for a feature that they aren't using, which is against a core value of Rust.

You are arguing for replacing bad behavior "abort on OOM" with something even worse, exceptions. I honestly don't think you know what exceptions entail wrt what compilers do and the resulting bloat.
There is an exception-like mechanism in Rust, in the form of the "try!" macro. It's a lot more flexible, but somewhat more verbose (Haskell has the same mechanism in a way that looks a lot more like exceptions, so that's not an inherent flaw). The best explanation I've seen is this:

http://www.jonathanturner.org/2015/11/learning-to-try-things...

tl;dr: "Result"s are like exceptions which are caught by default. You can (explicitly) propagate them upwards by using try!(...). This is nice because it means that you can tell what exceptions can occur in a block of code only using "local" information.

> There is an exception-like mechanism in Rust, in the form of the "try!" macro.

Correct. That's not the problem. If Rust's standard library returned Result in all cases where allocation could fail, I'd be satisfied. My primary issue is that they didn't, because Result is awkward.

Rust's designers went wrong in trying to have their cake and eat it too. They wanted to avoid exceptions and not make people care locally about errors. That's why they assert that errors just don't happen and abort if they do.

Throwing exceptions is a reasonable design choice. Returning error codes is a reasonable design choice. Pretending errors don't exist is not.

> Pretending errors don't exist is not.

We don't and we never have.

I don't think that there's any guarantee in Rust that malloc failure will abort rather than panic. That just happens to be the current implementation. I'm not sure I've ever heard of anyone running into that being an issue in practice, as opposed to this kind of abstract discussion. But I think that it wouldn't be considered a breaking change to switch from aborting to panicking if there were any kind of demand for it.

In Rust, exceptions (panic) are used for truly exceptional situations, like programmer error (indexing beyond the end of an array, division by zero) or things that practically are not expected to happen in a recoverable way in the course of ordinary use, like malloc failure. On modern virtual memory operating systems, malloc failure is so unlikely, and in application code there's so little you could reasonably do if it happened, that it is considered be a truly exceptional case.

On the other hand, Result is used for those kinds of errors that are expected to happen in practice even with working code on reasonable systems. IO errors, errors decoding UTF-8, etc.

Right now, catching exceptions (panics) using recover() is still considered unstable. There is some work ongoing to try and work out the API to help ensure safety, by marking types based on whether they are exception-safe or not; so you can use recover() with types that are built in an exception-safe way, or you can wrap types in AssertRecoverSafe to assert that you are providing exception-safety guarantees yourself, but you can't just arbitrarily recover from panics in code that has access to arbitrary data without someone having added an annotation somewhere that they believe that the code is exception-safe. https://github.com/rust-lang/rust/issues/27719 Note that based on the latest discussion, recover() will likely be named something else involving "unwind" to be more explicit about what it's doing.

And exception safety is quite important to the Rust authors. Note that Mutex has a built-in exception safety mechanism, poisoning the mutex on panic so that other users can't accidentally access the protected resource without being aware that another thread panicked while holding it.

Now, there are times when handling memory allocation failures properly is more important, such as in embedded systems or in operating system kernels, where you don't have a virtual memory abstraction with over-provisioning. However, in those cases you couldn't use the standard library anyhow, as the standard library depends on OS support; so you might as well use alternate data types that do return Result on allocating operations.

I'm just not sure about the utility of providing a convenient way to recover from malloc failure in applications running on virtual-memory operating systems. Can you show me an example in C++ (or any other language) where this is handled properly in application code in any way that doesn't simply log and abort, in which all unwinding code in the same application also avoids allocation as it may occur while unwinding from an allocation failure, and in which these code paths are actually tested in the test suite to ensure they behave properly?

There was a really interesting article on error handling in languages recently:

http://joeduffyblog.com/2016/02/07/the-error-model/

It makes the case that you do in fact want two different error handling mechanisms, because there are two quite different kinds of errors. The author argues that running out of memory is most practically treated as an unrecoverable error which aborts the process.