HN Mirror


by dignan 1955 days ago

GC is a memory management technique with tradeoffs like all the others.

GC has many different implementations, with widely ranging properties. For example, the JVM itself currently supports at least 3 different GC implementations. There are also different types of collection within a single implementation: in a generational garbage collection system you'll typically see two or three kinds of GC, depending on the generation (how many GC cycles they have survived) of the objects being collected. The shortest GCs in those systems usually take a couple of milliseconds, while the longest can take many seconds.
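To see which collectors a given JVM is actually running, here is a minimal sketch using the standard `java.lang.management` API. The class name `GcCollectors` and the helper `describe` are made up for illustration, and the collector names printed depend on the JVM and its flags (with G1 you would typically see one entry for the young generation and one for the old).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcCollectors {
    // Hypothetical helper: one summary line per collector the running JVM exposes.
    static List<String> describe() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time are cumulative since JVM start; either may be -1 if undefined.
            lines.add(String.format("%s: %d collections, %d ms total",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Allocate short-lived garbage so the young-generation collector has work to do.
        byte[][] junk = new byte[1024][];
        for (int i = 0; i < 200_000; i++) {
            junk[i % junk.length] = new byte[1024];
        }
        describe().forEach(System.out::println);
    }
}
```

On a generational collector the young-generation entry will usually show many cheap collections and the old-generation entry few or none, which is the short-versus-long split described above.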

GC isn't always a problem. If your application isn't latency sensitive, it's not a big deal, though if you set your network timeouts too low, even an application that isn't really latency sensitive can have trouble when GC pauses cause network connections to time out. Even in a latency-sensitive application, GC can be OK if the "stop the world" pauses (pauses that halt program execution) are short.

One reason you'll see people say GCs are bad is those latency-sensitive applications. For example, I previously worked on distributed datastores where low-latency responses were critical. If our 99th percentile response times jumped over, say, 250 ms, customers would call our support line in massive numbers. These datastores ran on the JVM, where at the time G1GC was the state-of-the-art low-latency GC. If the systems were overloaded or had badly tuned GC parameters, GC times could easily spike into the seconds range.
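To make the percentile point concrete, here is a rough sketch of how a handful of GC-stalled requests can dominate the 99th percentile while leaving the median untouched. The numbers are invented for illustration (20 ms typical responses, 2-second stalls), and `TailLatency` and `percentile` are hypothetical names, not anything from the systems described above. The percentile uses the simple nearest-rank definition.

```java
import java.util.Arrays;

public class TailLatency {
    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are less than or equal to it.
    static long percentile(long[] samplesMs, double p) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    // Hypothetical workload: `total` requests at 20 ms, of which `stalled`
    // were caught behind a 2000 ms stop-the-world pause.
    static long[] workload(int total, int stalled) {
        long[] samples = new long[total];
        Arrays.fill(samples, 20);
        for (int i = 0; i < stalled; i++) {
            samples[i] = 2000;
        }
        return samples;
    }

    public static void main(String[] args) {
        long[] samples = workload(1000, 15); // 1.5% of requests hit a pause
        System.out.println("p50 = " + percentile(samples, 50) + " ms");
        System.out.println("p99 = " + percentile(samples, 99) + " ms");
    }
}
```

With 15 of 1000 requests stalled, the median stays at 20 ms but p99 lands on a stalled request. Drop the stalled count below 1% of requests and p99 falls back to 20 ms, which is why rare long pauses matter so much more for tail latency than for averages.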

Other considerations are GC throughput and CPU usage. GC systems can use a lot of CPU, and that's often the tradeoff you'll see with these low-latency GC implementations. GCs can also put a cap on memory throughput. The question tends to come down to how much memory the GC implementation can examine, for how much CPU usage, and with how much stop-the-world time.
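As a crude way to put a number on GC overhead, the sketch below divides the accumulated collection time reported by the JVM by its uptime. This is only an approximation: `getCollectionTime()` reports approximate elapsed collection time, which is neither pure CPU time nor pure stop-the-world time, and concurrent collectors do part of their work alongside the application. `GcOverhead`, `totalGcMillis`, and `gcFraction` are names invented for this example.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcOverhead {
    // Sum of accumulated collection time across all collectors, in milliseconds.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // Fraction of wall-clock uptime attributed to GC (0.0 if uptime is not positive).
    static double gcFraction(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return (double) gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        long gc = totalGcMillis();
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC time: %d ms of %d ms uptime (%.2f%%)%n",
                gc, uptime, 100.0 * gcFraction(gc, uptime));
    }
}
```

Sampling this periodically in a long-running service gives a rough trend line; for actual pause durations, the JVM's own GC logs are the more reliable source.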