Hacker News new | ask | show | jobs
by twic 1325 days ago
FWIW i am pretty sure Java's HashMap has the same behaviour - it grows the table, but never shrinks it. Even if you call .clear(), it just clears out the table, rather than throwing the table away.

I imagine there are lots of scenarios in which this is what you want, because after emptying the map, you're going to re-fill it, and it saves reallocating the table. But it would be frustrating in a scenario when that isn't what you want.

If a map has this behaviour, i would say that the most important thing is that it should be clearly documented (Java's isn't). The second most important thing is that there should be a way to get round it - either a .clearHarder() method which throws away the table, or a .compact() method which downsizes it while retaining the content.

6 comments

Rust uses "shrink_to_fit()". Personally never had to use it, but you always end up scrolling by the backing allocation management for all the standard collections when looking through the docs.

> Shrinks the capacity of the map as much as possible. It will drop down as much as possible while maintaining the internal rules and possibly leaving some space in accordance with the resize policy.

https://doc.rust-lang.org/std/collections/struct.HashMap.htm...

And the docs makes it clear that "clear()" only removes the elements. Giving a hint of where to go next if you stumble upon the issue in the OP.

> "Clears the map, removing all key-value pairs. Keeps the allocated memory for reuse."

https://doc.rust-lang.org/std/collections/struct.HashMap.htm...

Wouldn't reallocating the map be very cheap when running in a JVM? Yes, it isn't something to do in the hot path, but surely it should be faster than a direct malloc(), right? I'm being very naive and ready to be proven wrong.
The main difference with Go is that the table of a HashMap in Java is a table of pointers, not a table of structs, so the overhead to not shrink the table is less an issue.
You're right that Java's Hashmap only ever resizes upwards - the most common use case. However, if there's a need to remove that memory after clearing() then the typical use case is to do something that allows the GC to sort it.

E.g.,

    var map = new HashMap<Bla, Bla>();
    // Many things added to map
    // Many but not all things removed from map
    // to shrink it to size of remaining items
    map = new HashMap<>(map);
Yeah, it's not nice, but the people who wrote the JDK are far smarter than me, so I figure they optimised for the 99% use-case, not the 1%.
I think that's exactly what TFA is proposing by recreating the map

    m := make(map[Bla]Bla)
    // add to map
    // mark map as ready for GC
    m = make(map[Bla]Bla)
>The second most important thing is that there should be a way to get round it - either a .clearHarder() method which throws away the table, or a .compact() method which downsizes it while retaining the content.

Wouldn't just creating a new map be a pretty good way of doing it? If the reduction in size is significant, then the cost of the new allocation will be small compared to the cost of all the allocations you've done to build the original map.

> If a map has this behaviour, i would say that the most important thing is that it should be clearly documented

Iirc, most hash table implementations don't automatically shrink. More documentation is always nice, but isn't not automatically shrinking kind of the expected "default" behavior?