Hacker News new | ask | show | jobs
by the8472 3026 days ago
No, that would be an incremental GC working in very small time slices.

A concurrent GC spends CPU cycles on different cores to do its work, which means it will not cause latency outliers in the threads processing the requests. They are still CPU cycles you don't have to serve other requests, hence they still affect throughput.

That is a simplified explanation of course, there are a lot of caveats.

In my original post I was mostly speaking about the measurement though, since they are measuring throughput when they are concerned about latency, those are somewhat related but depending on circumstances only weakly so.

2 comments

What about the requests that are processed by a core that's doing a GC? Wouldn't that cause a higher P99 latency exactly as you'd expect?

Spending 2.5% of cycles on GC doesn't mean those cycles are perfectly distributed. The distribution of GC work onto cores is bound to have some cores doing more work than others, which would (I would think) manifest itself as some requests that land on an unlucky core getting more latency. Isn't it totally expected that this would cause a P99 latency spike?

Maybe if you operated your system at the saturation point, which you really don't want to do in practice. Instead you want your queues to be mostly empty. Bursts are inevitable but bursts coinciding with GC doing work hopefully is a beyond 99th percentile thing. Of course if we're speaking purely theoretical we could also assume spherical cows in a vacuum and say that requests don't burst and simply arrive in a metronome-like trickle and then the spikes evaporate too. This is basically queuing theory.

And you could also use more CPU cores than request workers, that way you will always have spare core capacity and thus your latency will not be directly impacted. That is if you really really value latency more than throughput.

Again, my main point is that throughput and latency are not the same thing. There is some relation in so far that you cannot fulfill latency promises if your throughput is insufficient and your queues start filling up. But below the saturation point it's a lot more complicated, especially in parallel systems with bursty arrivals.

> A concurrent GC spends CPU cycles on different cores to do its work, which means it will not cause latency outliers in the threads processing the requests.

That makes sense, how about GC for single-threaded languages, e.g. Nodejs?

Just because a language is single threaded in the code you write doesn't mean it doesn't use threads behind the scenes, like for GC or other things.