Personally, I prefer looking at something more like 95th percentile latency, versus average, which is what I think this article is showing. I suppose a histogram would give you the fullest picture, though.
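To make the difference concrete, here is a minimal sketch comparing the mean, the 95th percentile, and a coarse histogram. The latency values and bucket edges are invented for illustration and are not taken from the article; `percentile` and `histogram` are hypothetical helpers using the nearest-rank definition.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

def histogram(samples, edges):
    """Count samples per bucket [edges[i], edges[i+1]); the last bucket is open-ended."""
    counts = [0] * len(edges)
    for s in samples:
        idx = 0
        for i, edge in enumerate(edges):
            if s >= edge:
                idx = i
        counts[idx] += 1
    return counts

# Made-up data (assumption): 90 fast requests at 5 ms, 10 slow ones at 500 ms.
latencies_ms = [5.0] * 90 + [500.0] * 10

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p95_ms = percentile(latencies_ms, 95)
buckets = histogram(latencies_ms, [0, 10, 100, 1000])

print(f"mean={mean_ms} ms, p50={p50_ms} ms, p95={p95_ms} ms, buckets={buckets}")
```

With this made-up data, the mean lands somewhere no request actually experienced, while the p95 and the histogram both expose the slow tail.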
> We measure latency for 10% of the requests, and plot each of these latencies individually on the graphs.
So for what it's worth these spikes may very well be single requests that are not relevant and are only triggered by the way the Kubernetes cluster was being manipulated for the test.
The spikes aren't single requests (at 1000 RPS, the spikes are well over a second long and you see hundreds of requests that spike). As for the reason, we suspect that the different config reload mechanisms in the respective proxies are what trigger the spikes.
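As a rough back-of-the-envelope check, the number of plotted points inside a spike follows from the request rate, the 10% sampling, and the spike length. The 2.5 second duration below is an assumed value standing in for "well over a second", and `sampled_points_in_spike` is a hypothetical helper.

```python
def sampled_points_in_spike(rps, sample_percent, spike_seconds):
    """Expected number of sampled (plotted) requests that fall inside a spike."""
    return rps * sample_percent * spike_seconds / 100

# 1000 RPS and 10% sampling come from the thread; the duration is an assumption.
points = sampled_points_in_spike(rps=1000, sample_percent=10, spike_seconds=2.5)
print(points)
```

Under those assumptions a spike covers a couple hundred plotted requests rather than one outlier.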