|
|
|
|
|
by whizzter
640 days ago
|
|
I'd like to see a reference to some GC actually doing explicit optimizations of this kind in a multithreaded scenario (not just as an indirect effect of theuir regular scanning), more or less linear scanning of memory is a natural consequence of a moving GC. The Java gc's are doing some crazy stuff, my point however was that the acyclic nature of the Erlang object graph enables them to do fairly "simple" optimizations to the GC that in practice should remove most need for pauses without hardware or otherwise expensive read barriers. It doesn't have to do a lot of things to be good, once you have cycles you need a lot more machinery to be able to do the same things. |
|
When it comes to Java - it has multiple GC implementations with different tradeoffs and degree of sophistication. I'm not very well versed in their details besides the fact that pretty much all of them are quite liberal with the use of host memory. So the way I approach it is by assuming that at least some of them resemble the GC implementation in .NET, given extensive evidence that under allocation-heavy scenarios they have similar (throughput) performance characteristics.
As for .NET itself, in server scenarios, it uses SRV GC which has per-core heaps (the count is sizing is now dynamically scalable per workload profile, leading to much smaller RAM footprint) and multi-threaded collection, which lends itself to very high throughput and linear scaling with cores even on very large hosts thanks to minimal contention (think 128C 1TiB RAM, stometimes you need to massage it with flags for this, but it's nowhere near the amount of ceremony required by Java).
Both SRV and WKS GC implementations use background collection for Gen2, large and pinned object heaps. Collection of Gen0 and Gen1 is pausing by design as it lends itself for much better throughput and pause times are short enough anyway.
They are short enough that modern .NET versions end up having better p99 latency than Go on multi-core throughput saturated nodes. Given decent enough codebase, you only ever need to worry about GC pause impact once you go into the territory of systems with hard realtime requirements. One of the better practical examples of this that exists in open source is Osu! which must run its game loop 1000hz - only 1ms of budget! This does pose challenges and requires much more hands-on interaction with GC like dynamically switching GC behaviopr depending on scenario: https://github.com/dotnet/runtime/issues/96213#issuecomment-... This, however, would be true with any language with automatic memory management, if it's possible to implement such a system in it in the first place.