Hacker News
by jamboca 1163 days ago
"The specific cause for the fragmentation could be any number of things: JSON parsing with serde, something at the framework-level in axum, something deeper in tokio, or even just a quirk of the specific allocator implementation for the given system. Even without knowing the root cause (if there is such a thing) the behavior is observable in our environment and somewhat reproducible in a bare-bones app."

So what is the ultimate cause of the fragmentation in this case? You can blame your allocator implementation, but how do you know it's not your use of the allocator? It feels like you are just slapping jemalloc on to solve this. I suppose that shows the allocator you were using before was causing problems, but it doesn't really explain how. What I'm wondering is why, specifically, the previous allocator was causing this fragmentation... isn't that a bigger problem than you're making it sound?

Also, what else can an allocator do other than coalescing free blocks to decrease fragmentation? Does it involve occasional checks to defragment the heap, i.e., moving separated blocks so they are adjacent?

2 comments

Our assumption, which turned out to be true, is that it's due to the JSON parsing code. Rust (serde) is very efficient at parsing JSON into a predefined structure, but when it comes to parsing into a "generic object", which we need for part of the payload, it's much less efficient. We are going to deploy a full fix for this issue too, but jemalloc already solved it as well.

Though I disagree with saying it's "just slapping jemalloc on to solve this". The piece of code in question definitely made the fragmentation issue worse, as it was making a lot of allocations of varying sizes, but the underlying issue of memory fragmentation because of the allocator was still there, and it would just have been triggered later by a different code path.
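For anyone wondering where the allocator actually gets swapped in a Rust program: it's a single `#[global_allocator]` static, and a jemalloc binding crate would be assigned to that same hook. Here's a minimal std-only sketch that wraps the system allocator and counts calls instead, so you can see how many allocations a code path makes. `CountingAlloc`, `ALLOCS`, `BYTES` and `allocations_during` are names I made up for illustration; none of this is from the article.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper: forwards every request to the system allocator
/// and keeps simple counters. It does not change fragmentation behavior.
pub struct CountingAlloc;

/// Number of allocation calls seen so far (process-wide).
pub static ALLOCS: AtomicUsize = AtomicUsize::new(0);
/// Total bytes requested so far (process-wide).
pub static BYTES: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCS.fetch_add(1, Ordering::Relaxed);
        BYTES.fetch_add(layout.size(), Ordering::Relaxed);
        // Delegate the real work to the platform allocator.
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// This one declaration is the hook: every Box, Vec, String, etc. in the
// program now goes through CountingAlloc.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Runs `f` and returns how many allocation calls happened meanwhile.
/// Other threads allocating at the same time are counted too.
pub fn allocations_during<F: FnOnce()>(f: F) -> usize {
    let before = ALLOCS.load(Ordering::Relaxed);
    f();
    ALLOCS.load(Ordering::Relaxed) - before
}
```

Wrapping a suspect code path (say, the generic JSON parse) in `allocations_during` gives a rough idea of how allocation-heavy it is, which is a cheap first check before blaming either the code or the allocator.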

Heap fragmentation often comes from allocating objects with different lifetimes at the same time on the same pages. Parsing is a common case of this because you allocate the whole object tree then only keep some of it.
Not in the context of an HTTP server, as we just parse the payload, use it while handling the request, and then return (freeing all the memory). I think the problem is that multiple requests are being handled concurrently, and Rust doesn't know it's probably better off allocating all of a request's data together and then freeing it as one big block.

That's what's nice about jemalloc, it has a more generic algorithm for reusing allocated blocks.

Well, it's good as a drop in solution but an arena allocator or malloc zones would really be best.
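To make the arena idea concrete, here's a rough sketch of a per-request arena, assuming the request only needs to stash strings. Everything lives in one buffer and is released in one step when the request is done. `Arena` and `Handle` are hypothetical names, and a real arena crate would hand out references rather than offsets.

```rust
/// Offset-based handle into the arena. Hypothetical type for illustration.
/// Handles become meaningless after `Arena::reset`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Handle {
    start: usize,
    len: usize,
}

/// A bump arena: allocations are appended to one buffer and are never
/// freed individually.
pub struct Arena {
    buf: Vec<u8>,
}

impl Arena {
    /// Reserves the backing buffer once, up front.
    pub fn with_capacity(cap: usize) -> Self {
        Arena { buf: Vec::with_capacity(cap) }
    }

    /// Copies `s` into the arena and returns a handle to it.
    pub fn alloc_str(&mut self, s: &str) -> Handle {
        let start = self.buf.len();
        self.buf.extend_from_slice(s.as_bytes());
        Handle { start, len: s.len() }
    }

    /// Looks up a string previously stored with `alloc_str`.
    pub fn get(&self, h: Handle) -> &str {
        let bytes = &self.buf[h.start..h.start + h.len];
        // Each range is exactly one copied &str, so it is valid UTF-8.
        std::str::from_utf8(bytes).expect("arena ranges hold whole strings")
    }

    /// Bytes currently in use.
    pub fn used(&self) -> usize {
        self.buf.len()
    }

    /// Capacity of the backing buffer.
    pub fn capacity(&self) -> usize {
        self.buf.capacity()
    }

    /// Releases everything at once, keeping the buffer for the next request.
    pub fn reset(&mut self) {
        self.buf.clear();
    }
}
```

The point is that objects with the same lifetime (one request) share one block, so nothing from request A ends up interleaved with request B, and the end-of-request free is a single `reset` instead of many small deallocations.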
The gold standard for avoiding fragmentation is what jemalloc does, namely only allocating objects of similar size from a given chunk of memory. Instead of a single global heap, there is a pool for every valid object size (and to keep the number of pools low, object sizes are rounded up to a set of buckets).

This means that there is more memory wasted for small programs, but as memory use grows the wastage caused by this remains constant and allocation and deallocation will always remain fast.
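A toy version of the size-bucket scheme, in case it helps. This is a simulation with slot numbers rather than real memory, and it assumes power-of-two buckets starting at 8 bytes, which is a simplification; real allocators typically space their size classes more finely. `size_class`, `class_index`, `Block` and `Pools` are names invented for the sketch.

```rust
/// Smallest bucket in this sketch (an assumption, not a jemalloc constant).
pub const MIN_CLASS: usize = 8;
/// Number of buckets tracked; enough for requests up to 2^34 bytes.
pub const CLASSES: usize = 32;

/// Rounds a request up to its bucket size (next power of two, at least 8).
pub fn size_class(request: usize) -> usize {
    request.max(MIN_CLASS).next_power_of_two()
}

/// Index of the pool that serves this request: 8 -> 0, 16 -> 1, 32 -> 2, ...
/// Requests above 2^34 bytes would index out of range in this sketch.
pub fn class_index(request: usize) -> usize {
    size_class(request).trailing_zeros() as usize - 3
}

/// A simulated allocation: which bucket it came from and which slot in it.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Block {
    pub class_size: usize,
    pub slot: usize,
}

/// One pool per size class, each with its own free list.
pub struct Pools {
    free_slots: Vec<Vec<usize>>, // freed slots, per class
    next_slot: Vec<usize>,       // next never-used slot, per class
}

impl Pools {
    pub fn new() -> Self {
        Pools {
            free_slots: vec![Vec::new(); CLASSES],
            next_slot: vec![0; CLASSES],
        }
    }

    /// Reuses a freed slot of the right class if there is one,
    /// otherwise carves a fresh slot from that class's pool.
    pub fn alloc(&mut self, request: usize) -> Block {
        let idx = class_index(request);
        let slot = match self.free_slots[idx].pop() {
            Some(s) => s,
            None => {
                let s = self.next_slot[idx];
                self.next_slot[idx] += 1;
                s
            }
        };
        Block { class_size: size_class(request), slot }
    }

    /// Returns a slot to the free list of its own class.
    pub fn free(&mut self, block: Block) {
        self.free_slots[class_index(block.class_size)].push(block.slot);
    }
}
```

Because a 24-byte and a 30-byte request both land in the 32-byte bucket, a freed slot fits the next request of that class exactly, so no odd-sized holes appear. The cost is the rounding: a 33-byte request occupies a 64-byte slot here.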

This isn't good enough, because the size of an allocation says nothing about what its lifetime is. If you know lifetimes or types then you can segregate by those, and that does help.

(It does help in that if you have fixed size slabs, you can't waste space on that page, but you can still waste the entire page.)
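Here's a small simulation of that "waste the entire page" case, assuming pages of 64 equal-size slots and a workload where every Nth object outlives the rest. It only counts pages; no real memory is involved, and `pages_pinned` and `SLOTS_PER_PAGE` are made-up names.

```rust
use std::collections::HashSet;

/// Slots per page in this sketch (e.g. 64-byte objects on a 4 KiB page).
pub const SLOTS_PER_PAGE: usize = 64;

/// Allocates `total` same-size objects in order, where object `i` is
/// long-lived when `i % long_lived_every == 0` (must be nonzero), then
/// frees every short-lived object.
///
/// Returns `(pages_still_held, live_objects)`.
///
/// With `segregate_by_lifetime == false`, objects fill pages in allocation
/// order, as a size-only slab would do. With `true`, long-lived objects get
/// pages of their own, so pages of short-lived objects empty out completely.
pub fn pages_pinned(
    total: usize,
    long_lived_every: usize,
    segregate_by_lifetime: bool,
) -> (usize, usize) {
    let mut pinned: HashSet<usize> = HashSet::new();
    let mut live = 0usize;

    for i in 0..total {
        if i % long_lived_every != 0 {
            // Short-lived: freed by the end, so it pins nothing.
            continue;
        }
        let page = if segregate_by_lifetime {
            // The k-th long-lived object goes to a dedicated long-lived page.
            live / SLOTS_PER_PAGE
        } else {
            // Placed wherever it fell in allocation order.
            i / SLOTS_PER_PAGE
        };
        pinned.insert(page);
        live += 1;
    }

    (pinned.len(), live)
}
```

With 6400 objects and one survivor per 64, the size-only layout keeps 100 pages alive for 100 objects (one per page), while segregating by lifetime packs the same 100 survivors into 2 pages. Same size class in both cases, so size bucketing alone doesn't prevent it.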