by kllrnohj 410 days ago
> Rust unfortunately picked the wrong default here for the sake of convenience, along with the default of assuming a global allocator. [...] Zig for example does it right by having explicit allocators from the start

Rust picked the right default for applications that run in an OS whereas Zig picked the right default for embedded. Both are good for their respective domains, neither is good at both domains. Zig's choice is verbose and useless on a typical desktop OS, especially with overcommit, whereas Rust's choice is problematic for embedded where things just work differently.


Various kinds of "desktop" applications like databases and video games use custom non-global allocators - per-thread, per-arena, etc - because they have specific memory allocation and usage patterns that a generic allocator does not handle as well as a targeted one can.

My current $dayjob involves a "server" application that needs to run in a strict memory limit. We had to write our own allocator and collections because the default ones' insistence on using GlobalAlloc infallibly doesn't work for us.

Thinking that only "embedded" cares about custom allocators is just naive.

> Thinking that only "embedded" cares about custom allocators is just naive.

I said absolutely no such thing? In my $dayjob working on graphics I, too, have used custom allocators for various things, primarily in C++ though, not Rust. But that in no way makes the default of a global allocator wrong, and often those custom allocators have specialized constraints that you can exploit with custom containers, too, so you probably wouldn't be reaching for the stdlib versions anyway.

I don't see why you would have to write your own - there are plenty of options in the crate ecosystem, but perhaps you found them insufficient?

As a video game developer, I've found the case for custom general-purpose allocators pretty weak in practice. It's exceedingly rare that you really want complicated nonlinear data structures, such as hash maps, to use a bump-allocator. One rehash and your fixed size arena blows up completely.

95% of use cases are covered by reusing flat data structures (`Vec`, `BinaryHeap`, etc.) between frames.
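To illustrate the reuse pattern: `Vec::clear` drops the elements but keeps the allocation, so a scratch vector that lives across frames stops allocating once it has grown to its working size. This is only a minimal sketch; `Particle` and `FrameScratch` are hypothetical names made up for the example.

```rust
// Hypothetical per-frame data; these names are illustrative only.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Particle {
    x: f32,
    y: f32,
}

/// Scratch storage that outlives any single frame.
struct FrameScratch {
    visible: Vec<Particle>,
}

impl FrameScratch {
    fn new() -> Self {
        Self { visible: Vec::new() }
    }

    /// Rebuilds the visible list for one frame, reusing last frame's allocation.
    fn collect_visible(&mut self, all: &[Particle]) -> &[Particle] {
        self.visible.clear(); // drops the elements, keeps the capacity
        self.visible
            .extend(all.iter().copied().filter(|p| p.x >= 0.0 && p.y >= 0.0));
        &self.visible
    }
}

fn main() {
    let all = vec![
        Particle { x: 1.0, y: 0.0 },
        Particle { x: -1.0, y: 0.0 },
        Particle { x: 2.0, y: 3.0 },
    ];
    let mut scratch = FrameScratch::new();

    scratch.collect_visible(&all);
    let cap_after_first = scratch.visible.capacity();

    // Later frames with the same workload do not grow the buffer again.
    for _ in 0..100 {
        scratch.collect_visible(&all);
    }
    assert_eq!(scratch.visible.capacity(), cap_after_first);
    println!("visible this frame: {}", scratch.visible.len());
}
```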

> there are plenty of options in the crate ecosystem

Who writes the crates?

That's public information. It's up to you to make the choice whether to trust someone, but the least you can do is look at the code and see if it matches what you would have done.

The allocator we wrote for $dayjob is essentially a buffer pool with a configurable number of "tiers" of buffers. "Static tiers" have N pre-allocated buffers of S bytes each, where N and S are provided by configuration for each tier. The "dynamic" tier malloc's on demand and can provide up to S bytes; it tracks how many bytes it has currently allocated.

Requests are matched against the smallest tier that can satisfy them (static tiers before dynamic). If no tier can satisfy a request (static tiers are too small or empty, dynamic tier's "remaining" count is too low), then that's an allocation failure and is handled by the caller accordingly. E.g. if the request was for the initial buffer for accepting a client connection, the client is disconnected.

When a buffer is returned to the allocator it's matched up to the tier it came from - if it came from a static tier it's placed back in that tier's list, if it came from the dynamic tier it's free()d and the tier's used counter is decremented.
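For readers who want something concrete, here is a rough single-threaded sketch of the tier matching and return logic described above. It is not the closed-source implementation: every name (`Pool`, `StaticTier`, `Origin`, `Buffer`) is made up, the dynamic tier uses a plain `Vec` allocation in place of malloc, and there is no locking or async path.

```rust
/// One static tier: a fixed set of pre-allocated buffers of `buf_size` bytes.
struct StaticTier {
    buf_size: usize,
    free: Vec<Box<[u8]>>,
}

/// Remembers which tier a buffer came from so it can be returned there.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Origin {
    Static(usize), // index into `Pool::static_tiers`
    Dynamic,
}

struct Buffer {
    data: Box<[u8]>,
    origin: Origin,
}

struct Pool {
    static_tiers: Vec<StaticTier>, // kept sorted by ascending `buf_size`
    dynamic_limit: usize,          // max bytes the dynamic tier may hold at once
    dynamic_used: usize,           // bytes currently handed out by the dynamic tier
}

impl Pool {
    /// `config` holds one `(count, size)` pair per static tier.
    fn new(mut config: Vec<(usize, usize)>, dynamic_limit: usize) -> Self {
        config.sort_by_key(|&(_, size)| size);
        let static_tiers = config
            .into_iter()
            .map(|(count, size)| StaticTier {
                buf_size: size,
                free: (0..count)
                    .map(|_| vec![0u8; size].into_boxed_slice())
                    .collect(),
            })
            .collect();
        Pool { static_tiers, dynamic_limit, dynamic_used: 0 }
    }

    /// Smallest static tier that fits and has a free buffer, then the dynamic
    /// tier; `None` is an allocation failure the caller has to handle.
    fn alloc(&mut self, len: usize) -> Option<Buffer> {
        for (i, tier) in self.static_tiers.iter_mut().enumerate() {
            if tier.buf_size >= len {
                if let Some(data) = tier.free.pop() {
                    return Some(Buffer { data, origin: Origin::Static(i) });
                }
            }
        }
        if len <= self.dynamic_limit - self.dynamic_used {
            self.dynamic_used += len;
            let data = vec![0u8; len].into_boxed_slice();
            return Some(Buffer { data, origin: Origin::Dynamic });
        }
        None
    }

    /// Static buffers go back on their tier's free list; dynamic buffers are
    /// freed (dropped here) and the used counter is decremented.
    fn release(&mut self, buf: Buffer) {
        match buf.origin {
            Origin::Static(i) => self.static_tiers[i].free.push(buf.data),
            Origin::Dynamic => self.dynamic_used -= buf.data.len(),
        }
    }
}

fn main() {
    // Two 64-byte buffers, one 1024-byte buffer, and 4096 dynamic bytes.
    let mut pool = Pool::new(vec![(2, 64), (1, 1024)], 4096);
    let a = pool.alloc(32).expect("fits the 64-byte tier");
    let b = pool.alloc(64).expect("fits the 64-byte tier");
    let c = pool.alloc(10).expect("64-byte tier is empty, falls to 1024");
    let d = pool.alloc(10).expect("static tiers exhausted, goes dynamic");
    println!("{:?} {:?} {:?} {:?}", a.origin, b.origin, c.origin, d.origin);
    assert!(pool.alloc(5000).is_none()); // nothing can satisfy this
    pool.release(d);
    pool.release(a);
}
```

A real version would also need thread safety and a way to stop callers from leaking buffers (for example returning them on `Drop`); both are left out to keep the matching logic visible.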

Buffers have a simple API similar to the bytes crate - "owned buffers" allow &mut access, "shared buffers" provide only & access and cloning them just increments a refcount, owned buffers can be split into smaller owned buffers or frozen into shared buffers, etc.
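A toy version of that owned/shared split, assuming nothing about the real types: `OwnedBuf` and `SharedBuf` are hypothetical names, and this sketch copies the bytes on `freeze` (via `Arc::from(Vec<u8>)`), which a pool-backed implementation would presumably avoid.

```rust
use std::sync::Arc;

/// Uniquely owned buffer: allows `&mut` access.
struct OwnedBuf {
    data: Vec<u8>,
}

/// Shared buffer: read-only, cloning only bumps a refcount.
#[derive(Clone)]
struct SharedBuf {
    data: Arc<[u8]>,
}

impl OwnedBuf {
    fn new(len: usize) -> Self {
        OwnedBuf { data: vec![0u8; len] }
    }

    fn as_mut_slice(&mut self) -> &mut [u8] {
        &mut self.data
    }

    /// Splits at `at`: `self` keeps `[0, at)`, the returned buffer owns the rest.
    /// Panics if `at` is greater than the current length.
    fn split_off(&mut self, at: usize) -> OwnedBuf {
        OwnedBuf { data: self.data.split_off(at) }
    }

    /// Gives up mutability in exchange for cheap cloning.
    fn freeze(self) -> SharedBuf {
        SharedBuf { data: Arc::from(self.data) }
    }
}

impl SharedBuf {
    fn as_slice(&self) -> &[u8] {
        &self.data
    }

    fn ref_count(&self) -> usize {
        Arc::strong_count(&self.data)
    }
}

fn main() {
    let mut head = OwnedBuf::new(8);
    head.as_mut_slice()[0] = 42;
    let tail = head.split_off(4); // head: 4 bytes, tail: 4 bytes

    let shared = head.freeze();
    let other = shared.clone(); // no copy of the bytes, just a refcount bump
    println!(
        "first byte {}, refs {}, tail len {}",
        other.as_slice()[0],
        shared.ref_count(),
        tail.data.len()
    );
}
```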

The allocator also has an API to query its usage as an aggregate percentage, which can be used to do things like proactively perform backpressure on new connections (reject them and let them retry later or connect to a different server) when the pool is above a threshold while continuing to service existing connections without a threshold.
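The admission check could look roughly like the following. How the real allocator aggregates usage across tiers isn't stated, so this sketch simply assumes used bytes over total bytes; `PoolUsage`, `Admission` and `admit_new_connection` are invented names.

```rust
/// Assumed aggregate view of the pool: bytes in use vs. total bytes it can provide.
struct PoolUsage {
    used_bytes: usize,
    capacity_bytes: usize,
}

impl PoolUsage {
    /// Usage as a percentage; a zero-capacity pool counts as full.
    fn percent(&self) -> f64 {
        if self.capacity_bytes == 0 {
            return 100.0;
        }
        self.used_bytes as f64 * 100.0 / self.capacity_bytes as f64
    }
}

#[derive(Debug, PartialEq)]
enum Admission {
    Accept,
    RejectRetryLater,
}

/// Only new connections are gated on the threshold; existing connections
/// keep allocating until the pool itself reports failure.
fn admit_new_connection(usage: &PoolUsage, threshold_percent: f64) -> Admission {
    if usage.percent() > threshold_percent {
        Admission::RejectRetryLater
    } else {
        Admission::Accept
    }
}

fn main() {
    let usage = PoolUsage { used_bytes: 900, capacity_bytes: 1000 };
    println!(
        "{:.1}% used -> {:?}",
        usage.percent(),
        admit_new_connection(&usage, 80.0)
    );
}
```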

The allocator can also be configured to allocate using `mmap(tempfile)` instead of malloc, because some parts of the server store small, infrequently-used data, so they can take the hit of storing their data "on disk", i.e. paged out of RAM, to leave RAM available for everything else. (We can't rely on the presence of a swapfile so there's no guarantee that regular memory will be able to be paged out.)

As for crates.io, there is no option. We need local allocators because different parts of the server use different instances of the above allocator with different tier configs. Stable Rust only supports replacing GlobalAlloc; everything to do with local allocators is unstable, and we don't intend to switch to nightly just for this. Also FWIW our allocator has both a sync and async API for allocation (some of the allocator instances are expected to run at capacity most of the time, so async allocation with a timeout provides some slack and backpressure as opposed to rejecting requests synchronously and causing churn), so it won't completely line up with std::alloc::Allocator even if/when that does get stabilized. (But the async allocation is used in a localized part of the server so we might consider having both an Allocator impl and the async direct API.)
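For context, this is roughly what the stable hook looks like: one process-wide allocator installed with `#[global_allocator]`. The sketch wraps `std::alloc::System` with a byte counter (`Counting` and `ALLOCATED` are made-up names). It shows why this doesn't help the use case above: there is exactly one such allocator per program, so different subsystems can't get different instances or configs through it.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Live heap bytes, as seen by the wrapper below.
static ALLOCATED: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical wrapper: forwards to the system allocator and counts bytes.
struct Counting;

unsafe impl GlobalAlloc for Counting {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            ALLOCATED.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        ALLOCATED.fetch_sub(layout.size(), Ordering::Relaxed);
    }
    // `realloc` and `alloc_zeroed` use the trait's default implementations,
    // which go through `alloc`/`dealloc` above, so the count stays consistent.
}

// The only allocator customization point on stable: one per program.
#[global_allocator]
static GLOBAL: Counting = Counting;

fn main() {
    let before = ALLOCATED.load(Ordering::Relaxed);
    let v = vec![0u8; 1 << 20];
    std::hint::black_box(&v); // keep the allocation from being optimized out
    let after = ALLOCATED.load(Ordering::Relaxed);
    println!("live bytes grew by {}", after.saturating_sub(before));
}
```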

And so because we need local allocators, we had to write our own replacements of Vec, Queue, Box, Arc, etc because the API for using custom A with them is also unstable.

Did you publish these by any chance?

Sorry, the code is closed source.