by jcelerier 2804 days ago
> when many C++ and Rust programs are about to end, they spend the last few cycles uselessly deallocating memory that would've immediately been freed via _exit(2)

Thank god they do this. How many times have I had to manually force Linux to release sockets because badly coded C programs opened them and forgot to release them, leaving them hanging for ~5 minutes after the process ended? With proper RAII classes this does not happen.

4 comments

That has nothing to do with deallocating memory. Of course there are other kinds of resources which are not automatically freed when a program exits.
Do you mean orphaned sockets, stuck in FIN_WAIT?

Surely what objects are meant to do is call the shutdown(2) syscall - or shutdown(3) C library function - on the socket in their destructor or whatever to prevent that. But I don't think the same applies to memory: once the process is destroyed, the kernel should reclaim all memory in the process page tables automatically. Otherwise you'd end up with a pretty trivial way of disabling the system by exhausting all the memory...

> Surely what objects are meant to do is call shutdown(2) syscall - or shutdown(3) C library function

well, the problem with non-RAII solutions is that you depend on the whims and talent of the programmer to call shutdown at some point. With a RAII solution like in C++ or Rust you know that if your socket opened successfully, a call to close will necessarily be issued.

Maybe I'm being dumb here, but with RAII in C++ at least, doesn't shutdown() and then close() have to be called on the socket by the programmer explicitly in the destructor for the class?
> doesn't shutdown() and then close() have to be called on the socket by the programmer explicitly in the destructor for the class?

yes, but the class has to be written only once - and I have personally never had to write it except in that one group project in school, since I use libraries that handle it - e.g. boost.asio or Qt Network.
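For illustration, a minimal sketch of what such a write-once class might look like on a POSIX system. The names `Socket`, `is_open` and `closed_after_scope` are made up for this example and are not taken from boost.asio or Qt Network; real libraries do considerably more (error reporting, lingering, async state).

```cpp
#include <sys/socket.h>
#include <unistd.h>
#include <fcntl.h>
#include <cassert>
#include <utility>

// Hypothetical RAII wrapper; not the API of any real library.
class Socket {
public:
    // Try to open a TCP/IPv4 socket; fd_ stays -1 if socket() fails.
    Socket() : fd_(::socket(AF_INET, SOCK_STREAM, 0)) {}
    ~Socket() { reset(); }

    // A descriptor has exactly one owner, so copying is disabled.
    Socket(const Socket&) = delete;
    Socket& operator=(const Socket&) = delete;

    // Moving transfers ownership and leaves the source empty.
    Socket(Socket&& other) noexcept : fd_(std::exchange(other.fd_, -1)) {}
    Socket& operator=(Socket&& other) noexcept {
        if (this != &other) {
            reset();
            fd_ = std::exchange(other.fd_, -1);
        }
        return *this;
    }

    bool valid() const { return fd_ >= 0; }
    int fd() const { return fd_; }

private:
    void reset() {
        if (fd_ >= 0) {
            // shutdown() just fails with ENOTCONN on a socket that was
            // never connected; the result is deliberately ignored here.
            ::shutdown(fd_, SHUT_RDWR);
            ::close(fd_);
            fd_ = -1;
        }
    }
    int fd_;
};

// True if fd currently refers to an open descriptor in this process.
bool is_open(int fd) { return ::fcntl(fd, F_GETFD) != -1; }

// Opens a socket in an inner scope and reports whether the descriptor
// was released when the scope ended, with no explicit close() call.
bool closed_after_scope() {
    int raw = -1;
    {
        Socket s;
        if (!s.valid()) return false;
        raw = s.fd();
        if (!is_open(raw)) return false;
    }  // ~Socket() runs here
    return !is_open(raw);
}
```

Calling `closed_after_scope()` should return true whenever the socket could be created: the code using `Socket` never mentions shutdown or close, only the destructor does.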

If you are in C, even if you use an abstraction layer, you have to remember to call a _free / _destroy-like function every time you write some code that uses sockets.

Point being, the same is not true for virtual memory. You could leave the memory deallocation out of the destructors, and it is all going to be returned to the OS instantly on _exit().
Does _exit wipe out memory or just mark some regions as free? Asking for security purposes.
I don't know how Linux manages its memory pages. FreeBSD would put all of the anonymous pages onto essentially a free, but not zeroed queue. And there's an optional background job to zero the pages and put them in the zeroed queue. When a new page is needed, the clean queue is checked first, otherwise nonzeroed pages are zeroed on demand during allocation. (Zeroing can be theoretically skipped in cases where the kernel knows the full page will be written to before any reads)

Zeroing on exit would be more secure, but significantly slower -- you want to exit quickly, so you can potentially start a replacement program, which would be expected to, at least sometimes, take time to allocate the same amount of memory. If it does allocate the whole amount immediately, it's not necessarily any slower in total time between zeroing at exit or on mapping; but if there's enough time for the pages to get zeroed in the background, that reduces the amount of time waiting for the kernel to do things.

Maybe a randomized sparse zeroing ?
I'm not sure what a randomized zeroing would get you from a security perspective. You shouldn't need to be concerned about other programs observing the memory, kernels are expected to give programs only zeroed pages. If you're concerned about kernel level memory dumping, randomized zeroing isn't good enough -- it may or may not have zeroed your secrets, so that's not very helpful. Background zeroing doesn't help much here either -- FreeBSD sets a target of zeroing half the free pages, so your secrets may not be zeroed for a long time.

It seems the jury is out on the benefits from a performance perspective (DragonflyBSD took out background zeroing, saying they were unable to observe a performance difference, so simpler code is better)

Not much indeed but it might deter some low hanging leak hacks..
Why? Taking into account the CPU cache, branch mispredictions, etc., I bet it would be slower than just zeroing everything. Besides, it wouldn't be secure at all: imagine a process that stores a secret key and then releases the memory. If another process can trigger the first to generate and release the key memory multiple times, it would be able to read it.
Marks it as free, but the OS will wipe it before giving it to another process.
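A small way to observe the "wiped before handing it out" side from user space, as a sketch assuming Linux (or another system that supports `MAP_ANONYMOUS`). The helper name `fresh_pages_are_zero` is made up for this example. It only shows that freshly mapped anonymous memory reads as zero; it says nothing about when the kernel actually zeroed the underlying pages.

```cpp
#include <sys/mman.h>
#include <cstddef>
#include <cassert>

// Maps `len` bytes of fresh anonymous memory and reports whether every
// byte reads back as zero. Hypothetical helper, for illustration only.
bool fresh_pages_are_zero(std::size_t len) {
    void* p = ::mmap(nullptr, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return false;

    const unsigned char* bytes = static_cast<const unsigned char*>(p);
    bool all_zero = true;
    for (std::size_t i = 0; i < len; ++i) {
        if (bytes[i] != 0) {
            all_zero = false;
            break;
        }
    }

    ::munmap(p, len);
    return all_zero;
}
```

For example, `fresh_pages_are_zero(1 << 20)` should return true: whatever a previous process left in those physical pages is not visible through the new mapping.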
Your example, combined with the parent's observation, shows that C++ puts under the same construct concepts that should be separated: memory allocation should be handled differently from construction, destruction and other resource allocation.
Memory allocation and deallocation on the heap basically mean calling the `operator new` and `operator delete` functions in C++. The language provides a default implementation but you can override it.

Constructors are orthogonal. The job of a constructor is to construct your object given that the space for the object is already allocated. This could be on the stack, where allocation means bumping the stack pointer, or in-place in preallocated storage (like std::vector), or the result of calling `operator new`. Simply using the `new` syntax does both as a shorthand.

Similarly the job of a destructor is to destruct your object without deallocating it. One can in-place destruct without deallocating, or destruct and then deallocate implicitly when the stack pointer is adjusted, or not at all. The `delete` syntax does both destruction and deallocation as a convenience.
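A minimal sketch of the four steps done by hand, to make the separation concrete. `Tracker` and `separate_steps_ok` are hypothetical names invented for this example; the calls themselves (`::operator new`, placement `new`, an explicit destructor call, `::operator delete`) are standard C++.

```cpp
#include <new>
#include <cassert>

// Hypothetical type that counts how many objects are currently alive.
struct Tracker {
    static int live;
    Tracker() { ++live; }
    ~Tracker() { --live; }
};
int Tracker::live = 0;

// Performs allocation, construction, destruction and deallocation as
// four separate steps and checks the live count after each one.
bool separate_steps_ok() {
    bool ok = true;

    void* raw = ::operator new(sizeof(Tracker));  // allocate only
    ok = ok && (Tracker::live == 0);              // no object exists yet

    Tracker* t = ::new (raw) Tracker;             // construct only (placement new)
    ok = ok && (Tracker::live == 1);

    t->~Tracker();                                // destruct only
    ok = ok && (Tracker::live == 0);              // storage is still allocated

    ::operator delete(raw);                       // deallocate only
    return ok;
}
```

`new Tracker` followed by `delete` on the result would do the same four things in two expressions. Containers such as std::vector use the split form internally so they can hold raw capacity without constructing objects in it.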

Memory allocation/deallocation in C++ is handled separately from construction and destruction. The delete and new syntax is shorthand for combining the two.
> memory allocation should be handled differently from the construction, destruction and other resource allocation.

These other resources still need an in-memory representation to track and reference resources, so you can't really separate them.