|
> Also, in this case, the amount of required memory is less significant during peak times due to some optimizations to reduce the memory consumed. Pretty sure it’s the same: because of collisions, maps always have a certain fraction of empty slots (much more so than vectors). I’ve not heard of Go maps being very high load factors, so they probably resize around 50% or so full. I didn’t check any of the numbers, I’ll assume 50% load factor, doubling on resize, and entries of {hash, key, value} with a word-sized (8 bytes) hash, but some and likely all of those are likely off. Anyway with a load factor of 50% and doubling at any point you’d have 2n to 4n slots in the map total (depending whether you’re nearing resize or just went through one). And thus if you make entries smaller, you make all those overhead slots smaller as well. hash + int + blob would be 8+8+128 144 bytes, a million of those is 144MB, x2 is 288 and x4 is 576, so the observed bounds are pretty close: IIRC Go maps use some form of separate chaining (though not “trivial” linked lists), I assume those do get reclaimed as entries are removed even if the map itself does not get shrunk, which would explain why memory consumption goes down as elements are removed from the initial map. With a pointer each “entry” is 24 bytes instead, for a map overhead of 48~96MB (so clearly Go maps use either a higher load factor or a more efficient storage than the trivial scheme being assumed here, possibly both). The actual data accounts for 128M plus some, there’s a 144M difference between the full and the emptied pointer-maps, which would account for some chain links being removed, and maybe some object headers / allocator overhead for the on-heap arrays. |
Go uses a load factor of 6.5 (expressed as average load of a bucket; where buckets have a maximum load of 8; i.e. 81% full) (ref: https://github.com/golang/go/blob/go1.19.3/src/runtime/map.g... )
> I’ll assume … doubling on resize
Indeed (ref: https://github.com/golang/go/blob/go1.19.3/src/runtime/map.g...)