|
|
|
|
|
by ninkendo
3032 days ago
|
|
What about the requests that are processed by a core that's doing a GC? Wouldn't that cause a higher P99 latency exactly as you'd expect? Spending 2.5% of cycles on GC doesn't mean those cycles are perfectly distributed. The distribution of GC work onto cores is bound to have some cores doing more work than others, which would (I would think) manifest itself as some requests that land on an unlucky core getting more latency. Isn't it totally expected that this would cause a P99 latency spike? |
|
And you could also use more CPU cores than request workers, that way you will always have spare core capacity and thus your latency will not be directly impacted. That is if you really really value latency more than throughput.
Again, my main point is that throughput and latency are not the same thing. There is some relation in so far that you cannot fulfill latency promises if your throughput is insufficient and your queues start filling up. But below the saturation point it's a lot more complicated, especially in parallel systems with bursty arrivals.