Hacker News new | ask | show | jobs
by suresk 1033 days ago
It's one of those things where you usually have to let profiling and other observations guide your approach. 99.9% of the time it doesn't really matter and the default behavior is fine. But I can think of a few times where this has been a big deal.

One in particular - I was profiling an application with low-latency needs and GC was taking up a ton of time. Mission control showed tons of allocations of arrays - at one point it was creating a bunch of lists in a loop and adding stuff to them, which triggered creating a new underlying array. We found that a) Many of the arrays were just over the first resizing size, and b) There was a good heuristic that we could use to give them an initial size that would never have to be expanded and wouldn't result in huge amounts of waste.

This had a pretty dramatic effect on our GC times and the overall latency. I think this is where the JVM really shines - tons of tooling to help you profile and observe these kinds of details to help you figure out when you actually need to care about stuff like the initial array capacity.

1 comments

Depends a lot on what you're doing too. I do a fair bit of heavy data processing work with my search engine (tokenizing something like a billion documents into arrays of words etc), and allocator contention has a pretty huge performance impact for that type of work.

My intuition is that the best thing is to aim for the expected median size, rather than the maximum as one might assume would be the most performant. The maximum strategy minimizes re-allocations, but at the expense of always making costlier allocations.

I think it depends a lot on the other details, especially how expensive the extra GC will be vs the wasted space. Hard to give a rule that will work in all contexts.

In our case, it wasn't a single hard-coded number - the input data gave us the upper bound, and the difference between the upper bound and the median case was so small that going with the upper bound worked out best.