Hacker News new | ask | show | jobs
by 10000truths 529 days ago
I recommend against using linked lists for bookkeeping the free blocks. It seems to be the data structure that every malloc/free implementation reaches for, and I don't know why - the slowness of pointer-chasing makes it terrible for almost any real-world use case. A balanced tree would be a much better idea, given that all the essential operations would take O(log n) time instead of O(n). Even if one insists on a linear search, a bitset is much more cache friendly than pointer-chasing and it can trivially benefit from SIMD optimizations.
6 comments

For the `Chunk` list, this isn't one of the cases where linked lists are harmful. Each use only touches the top of the stack, never iterates. Also, linked lists are much easier to make thread-safe.

For the `LinkedPtr` list, the bad case is only hit during the destruction of the pool, and then only if you allocate a lot of memory. And given the overhead of deallocation I'm not sure the array approach would measurably help.

I don't see anywhere a binary tree search would be useful here, since there are no loops used for lookup (on allocation they're pushed in order, but when freed chunks are pushed in arbitrary order; this does mean no double-free protection).

why would you need to iterate on destruction? you free the whole pool at once, since you allocate it all at once
The reason linked lists are used is for large enough allocations, there is no overhead. You use the space the application isn't using. In addition, if all allocations are the same size it is O(1), you just look at the head of the list.

More sophisticated strategies bucket allocations by size, this has fixed overhead. You can also use balanced trees for more memory efficiency, but this is slower.

For small allocations (8 bytes) that are too small to contain pointers allocators will allocate a block and use bitsets.

Pool allocators don't walk the list or search anything though. All interactions are only at the list head and O(1), as all free nodes are just that, free and equal.
Then it might be a misnomber. Calling it "stack" instead of "list" might be better.
That is a fair comment.
While the most precise naming might be "a singly linked list representation of a stack", "free list" (https://en.wikipedia.org/wiki/Free_list) is an ancient term of art for this problem.. so much so that wikipedia (at least the version today) even suggests "freelist" all one word as a spelling.

The initial linking together of all free space (also cache friendly, btw) is often called "threading the free list", although this terminology is less standard than "free list" itself.

Suppose the information did happen to search through the free blocks. Suppose you put them into an array instead of a linked list. They can't actually be in the array, you see? The blocks are in the pool heap wherever they happen to be. So the array has to point to them. As you walk through the array you have to do reference the pointers. The only way it's better is that they are not dependent loads. But you have this array to manage now.
Which pool operation is made more costly by the linked list? Both allocate and free are constant time, and trivial too.

The only thing that I can imagine being faster is a bump allocator, but that precludes free.

I don't think you understand how the allocator in the article works. Allocating and freeing are already O(1), creating and closing the allocator are necessarily O(n). There is no search being done here.