by yongjik 1587 days ago
This is probably too involved to be a question for your interviewer, but I kinda want to ask, "Imagine you have multiple instances of a production service with different 90th-percentile (p90) latencies, how would you measure the overall latency of the service?"

A surprisingly large number of SWEs are really bad at statistics, which is kinda necessary when you're asking whether a service is fast enough.

1 comment

What's your answer? I'm tempted to average them?
You can't average percentiles - the result isn't mathematically meaningful, so your number will move depending on incidental details such as how requests happen to be spread across servers. E.g., if you changed your load balancing algorithm and then observed the number go down, you couldn't conclude that latency improved, because you also changed how the requests are grouped per server, and that shifts the result in unpredictable directions.

(You may get a usable number if all servers are more-or-less equivalent and events are randomly distributed, but then you're basically assuming the thing you want to validate.)
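
To make that concrete, here's a rough sketch with made-up numbers: the same 20 requests (16 fast, 4 slow) routed to two servers in two different ways. The 10 ms / 100 ms latencies, the nearest-rank percentile definition, and all the function names are assumptions chosen for illustration, not anything from a real service.

```python
def percentile(values, pct):
    """Nearest-rank percentile; pct is an integer such as 90."""
    ordered = sorted(values)
    rank = max(1, (len(ordered) * pct + 99) // 100)  # ceil(n * pct / 100)
    return ordered[rank - 1]

FAST, SLOW = 10, 100  # latencies in ms (hypothetical values)

# The same 20 requests (16 fast, 4 slow), routed two different ways.
even_split = [[FAST] * 8 + [SLOW] * 2, [FAST] * 8 + [SLOW] * 2]
skewed_split = [[FAST] * 10, [FAST] * 6 + [SLOW] * 4]

def average_of_p90s(servers):
    """The tempting-but-wrong aggregate: mean of per-server p90 values."""
    return sum(percentile(s, 90) for s in servers) / len(servers)

def pooled_p90(servers):
    """p90 computed once over every request from every server."""
    return percentile([x for s in servers for x in s], 90)

print(average_of_p90s(even_split), pooled_p90(even_split))      # 100.0 100
print(average_of_p90s(skewed_split), pooled_p90(skewed_split))  # 55.0 100
```

Nothing about the requests changed between the two cases, only the routing, yet the averaged number drops from 100 to 55 while the pooled p90 stays at 100.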

I'm not an expert, but I know two correct ways: collect the records of all individual requests and compute the 90th percentile just once on the whole set (doable if there aren't too many requests - modern machines are quite powerful), or generate per-server histograms, which can be merged safely as long as every server uses the same bucket boundaries.
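
A minimal sketch of the histogram approach, assuming every server uses the same bucket boundaries. The `BOUNDS` values, the sample latencies, and the function names are all hypothetical. Merging is just adding counts bucket by bucket, so the merged histogram doesn't depend on how requests were routed; the price is that the answer is only as precise as the bucket width.

```python
import bisect

# Shared upper bounds of the buckets, in ms (hypothetical choice).
# One extra overflow bucket catches anything above the last bound.
BOUNDS = [10, 25, 50, 100, 250, 500, 1000]

def to_histogram(latencies):
    """Count how many latencies fall into each bucket (value <= bound)."""
    counts = [0] * (len(BOUNDS) + 1)
    for x in latencies:
        counts[bisect.bisect_left(BOUNDS, x)] += 1
    return counts

def merge(histograms):
    """Merge per-server histograms by summing counts bucket by bucket."""
    return [sum(column) for column in zip(*histograms)]

def percentile_upper_bound(counts, pct):
    """Upper bound of the bucket that contains the nearest-rank percentile."""
    total = sum(counts)
    rank = max(1, (total * pct + 99) // 100)  # ceil(total * pct / 100)
    running = 0
    for i, count in enumerate(counts):
        running += count
        if running >= rank:
            return BOUNDS[i] if i < len(BOUNDS) else float("inf")
    raise ValueError("empty histogram")

# Two servers with very different traffic: 7 ms fast and 80 ms slow requests.
server_a = to_histogram([7] * 10)
server_b = to_histogram([7] * 6 + [80] * 4)

merged = merge([server_a, server_b])
print(percentile_upper_bound(merged, 90))  # 100, i.e. "p90 is at most 100 ms"
```

Here the true pooled p90 is 80 ms and the histogram reports the enclosing bucket's bound of 100 ms, so finer buckets give a tighter estimate.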