by molodec 2200 days ago
The specific workload matters a lot. I had a good experience with the Shenandoah collector on an application that generates very few intermediate objects, but where an object, once created, stays in the heap for a while (a custom-made key/value store for a very specific use case). Shenandoah was the best in terms of throughput and memory utilization. Most collectors are generational, so surviving objects have to be moved from Eden to Survivor to Old. Shenandoah is not generational, and I suspect it has less work to do for objects that survive compared to other collectors. When most objects live long enough, generational collectors hinder performance.

In the case of Hazelcast Jet and similar products, a lot of young garbage is unavoidable because it comes from the data streaming through the pipeline. A generational GC should in principle get a great head start on this kind of workload, and our benchmarks have confirmed it.
Yep, workload matters. Generational garbage collectors are fundamentally at odds with caching/pooling of objects. They are based on the assumption that objects die young. Typically that is not the case for internal caches, though. Caches usually consist of long-living/tenured objects.
It is a stretch to claim caching is fundamentally at odds with GC. It is more correct to say that LRU breaks the generational hypothesis, because it prioritizes new entries, which then take a long time to be evicted. However, many workloads are frequency-biased, and one-hit wonders (entries that are read once and never again) degrade the hit rate. That is why you'll see more aggressive eviction in a modern policy, so you'll get better GC behavior and higher hit rates using something like Java's Caffeine library.
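To make the LRU point concrete, here is a minimal sketch using only the JDK. It is a toy LRU built on `LinkedHashMap` in access order, not Caffeine and not its eviction policy; the class and method names (`LruScanDemo`, `hotKeySurvivesScan`) are made up for illustration. It shows that a run of one-hit keys longer than the cache capacity pushes out a key that was read many times, and that each one-hit entry stays resident for a full pass through the cache before it is evicted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy LRU cache on top of LinkedHashMap's access order. Illustrative only. */
class LruScanDemo {

    /** Fixed-capacity LRU map: the least recently accessed entry is evicted first. */
    static final class LruCache<K, V> extends LinkedHashMap<K, V> {
        private final int capacity;

        LruCache(int capacity) {
            super(16, 0.75f, true); // true = order entries by access, not insertion
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > capacity; // evict once the capacity is exceeded
        }
    }

    /** Reads a hot key ten times, then inserts scanLength keys that are never read again. */
    static boolean hotKeySurvivesScan(int capacity, int scanLength) {
        LruCache<String, byte[]> cache = new LruCache<>(capacity);
        cache.put("hot", new byte[64]);
        for (int i = 0; i < 10; i++) {
            cache.get("hot"); // plain LRU records recency only, not frequency
        }
        for (int i = 0; i < scanLength; i++) {
            cache.put("one-hit-" + i, new byte[64]);
        }
        return cache.containsKey("hot"); // containsKey does not touch the access order
    }

    public static void main(String[] args) {
        System.out.println(hotKeySurvivesScan(100, 50));  // scan shorter than capacity
        System.out.println(hotKeySurvivesScan(100, 200)); // scan longer than capacity
    }
}
```

With a capacity of 100, the hot key survives the 50-key scan and is gone after the 200-key scan. Under a generational collector, each of those one-hit values may well get promoted before it is finally evicted, which is the bad case for the generational bet.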
Keep in mind that it's not fundamental. Generational GCs just make a bet that you can save a lot of effort by segregating objects by age. In almost all Java workloads there are plenty of short-lived objects, and a generational GC takes care of them at an especially low cost. The price to pay for that is pretty low: basically the overhead of card marking (which needs a write barrier) and the subsequent partial scanning of the Old Generation if there are many references from old to new objects.
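For anyone who hasn't seen card marking, here is a rough toy model in plain Java. It is only a sketch of the idea, not how any JVM implements it (real collectors emit the barrier in generated machine code and track bytes, not array slots). The class `CardTableDemo`, the 512-slot card size, and the use of integer ids as stand-ins for references to young objects are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Toy model of an old generation with a card table. Illustrative only. */
class CardTableDemo {
    static final int CARD_SHIFT = 9; // 2^9 = 512 slots per card, an illustrative size

    final int[] oldGen;      // each slot is a "reference": -1 = null, otherwise a young object id
    final BitSet dirtyCards; // one bit per card

    CardTableDemo(int oldSlots) {
        oldGen = new int[oldSlots];
        Arrays.fill(oldGen, -1);
        dirtyCards = new BitSet((oldSlots >> CARD_SHIFT) + 1);
    }

    /** Write barrier: store the reference, then mark the card that covers the slot. */
    void writeRef(int slot, int youngId) {
        oldGen[slot] = youngId;
        dirtyCards.set(slot >> CARD_SHIFT);
    }

    /** Minor-GC root scan: visit only slots on dirty cards. Returns the number of slots visited. */
    int scanDirtyCards(List<Integer> rootsOut) {
        int visited = 0;
        for (int card = dirtyCards.nextSetBit(0); card >= 0; card = dirtyCards.nextSetBit(card + 1)) {
            int start = card << CARD_SHIFT;
            int end = Math.min(start + (1 << CARD_SHIFT), oldGen.length);
            for (int slot = start; slot < end; slot++) {
                visited++;
                if (oldGen[slot] >= 0) {
                    rootsOut.add(oldGen[slot]); // old-to-young reference found
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        CardTableDemo heap = new CardTableDemo(1 << 20);
        heap.writeRef(5, 1);
        heap.writeRef(6, 2);       // same card as slot 5
        heap.writeRef(100_000, 3); // a different card
        List<Integer> roots = new ArrayList<>();
        int visited = heap.scanDirtyCards(roots);
        System.out.println(visited + " of " + heap.oldGen.length + " slots scanned, roots " + roots);
    }
}
```

In this run only two cards are dirty, so the scan touches 1024 of the 1,048,576 old-generation slots. The cost grows with the number of dirty cards, which matches the point above: the scheme is cheap unless there are many old-to-new references.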

Only very specialized workloads won't create many short-lived objects, and for those cases there are alternative non-generational GCs on the JVM (ZGC, Shenandoah).
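If you want to check which collector a given JVM actually ended up with, a small sketch using the standard `java.lang.management` API can help. The class name `GcInfo` is made up; the collector is normally selected on the command line with `-XX:+UseShenandoahGC` or `-XX:+UseZGC`, and I'm assuming your JDK build includes the collector you ask for (not every build ships Shenandoah, and older JDKs may also require `-XX:+UnlockExperimentalVMOptions`).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

/** Prints the collectors and heap memory pools of the running JVM. */
class GcInfo {

    /** Names of the garbage collector beans registered by the JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** Names of the heap memory pools; generational collectors typically expose several. */
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Collectors: " + collectorNames());
        System.out.println("Heap pools: " + heapPoolNames());
    }
}
```

The exact names depend on the JVM and the flags, so I'm not listing expected output. With a generational collector you would typically see separate eden, survivor and old pools; with a non-generational one the heap usually shows up as a single pool.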