Hacker News new | ask | show | jobs
by eeadc 4638 days ago
The fact that returning memory to the Kernel is hard is supported by the circumstance, that most allocators will use brk/sbrk to resize the data segment of the executing process to allocate memory, at least if they shall allocate few memory.

The other fact, that allocators have to lock global data structures is also not true. Most modern operating systems supports thread-local storage and therefore you don't need locking because you can keep much per-threads allocators, and only if you want to release memory of a foreign thread you have to lock (but that's also bad practice in most cases).

Therefore, this article is great if your horizon end at the default allocators tcmalloc, ptmalloc and jemalloc, but the reality is much more complex. The fact that such a thing doesn't exists isn't founded in the fact that it's hard to implement, it's founded in the fact that there is no need for such an allocator, because most well-written software will allocate large chunks of memory.

3 comments

> (...) that most allocators will use brk/sbrk to resize the data segment of the executing process to allocate memory, at least if they shall allocate few memory.

The other commonly used backend for malloc() is mmap() without underlying file:

  void *chunk = mmap(NULL, length, PROT_READ|PROT_WRITE, MAP_ANONYMOUS, -1, 0);
Handy both when allocating large chunks of memory and for allocating pools for smaller suballocations. Has the additional benefit of being zeroed-out at low cost (or no cost at all -- for example via hardware DMA), and also playing nice on systems with constrained / fragmented address space, as kernel is free to allocate at any address visible from userspace.
> Most modern operating systems supports thread-local storage and therefore you don't need locking because you can keep much per-threads allocators

The article explicitly points this out, and points out the problem with it: It means wasting memory on per-thread pools, and the more threads you use, the larger the pools needs to be if you want to prevent contention, compounding the problem.

> because most well-written software will allocate large chunks of memory.

1. Most software is not well written.

2. Most large pieces well written pieces of software that allocates only large chunks of memory has some custom allocator of some sort (or horrible abuses of arrays) embedded somewhere to work around exactly the problems noted in the article. In many cases people end up wasting time writing the same types of specialised allocators over and over.

I've seen plenty of large C and C++ apps that'd have benefitted greatly from a simple arena allocator for example... And I have also seen countless of implementations of arena allocators and various pool allocators and tons of other variations.

In other words: These things do exist. They're common, to the point where they're often covered in books on C/C++. Especially for C++ where there is specific built in (though weak) support for custom allocators.

> because most well-written software will allocate large chunks of memory.

The average string length in most programs is about 5 to 10 bytes. Plenty of well written software works with strings like that.