by valarauca1 2806 days ago

    Bump allocation is efficient and fast in single threaded
    programs but almost all Go applications are multi
    threaded
Bump allocation doesn't require locks, just an atomic add and a branch to ensure you aren't allocating past the end of the current "arena" where memory is being allocated. A trivial optimization is to provide thread-local allocation arenas, which remove the need for longer pauses as locality improves (less cache coherence protocol work for the silicon).
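
To make the "atomic add and a branch" concrete, here is a rough sketch in Go. The `Arena` type, `NewArena`, and `Alloc` are names I made up for illustration, not anything from the Go runtime, and it assumes Go 1.19+ for `atomic.Uint64`. It ignores alignment and never frees anything.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Arena is a hypothetical fixed-size region shared by all goroutines.
// Memory is handed out by bumping a single offset forward.
type Arena struct {
	buf  []byte
	next atomic.Uint64 // offset of the first unreserved byte
}

// NewArena allocates the backing buffer once, up front.
func NewArena(size int) *Arena {
	return &Arena{buf: make([]byte, size)}
}

// Alloc reserves n bytes using one atomic add and one branch.
// It returns nil once the arena is exhausted. Failed calls still
// advance the offset; a real allocator would switch to a new arena.
func (a *Arena) Alloc(n uint64) []byte {
	end := a.next.Add(n) // atomic add: claims the range [end-n, end)
	if end > uint64(len(a.buf)) {
		return nil // branch: past the end of the current arena
	}
	return a.buf[end-n : end : end] // capacity capped so appends cannot overlap
}

func main() {
	const workers, perWorker, blockSize = 4, 1000, 16
	arena := NewArena(workers * perWorker * blockSize)

	var ok atomic.Int64
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				if arena.Alloc(blockSize) != nil {
					ok.Add(1)
				}
			}
		}()
	}
	wg.Wait()

	fmt.Println("successful allocations:", ok.Load())
	fmt.Println("arena exhausted:", arena.Alloc(1) == nil)
}
```

With a thread-local arena the atomic add becomes a plain add, since only one thread ever touches that offset.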

Of course, these schemes require some kind of relocation, but they make allocation blindingly fast. The only way to get faster is to pre-fault the arena and hint for the _next_ chunk of the arena to be loaded into L1/L2 cache.