Ask HN: Anyone else seeing people averaging percentiles?
15 points by stat_throwaway 1734 days ago
I work at a middling SV company, and every day I see people taking averages of percentiles (or, even crazier, percentiles of percentiles). That is, each server computes its own "50%/90%/95% latency" over all of its requests and sends it to a central time-series database, and the Grafana console averages them all to show a "nice" graph. These graphs are used for everything from alerts to launch decisions. It's driving me crazy because it makes no sense: you can't average percentiles. You get bogus numbers that jump up and down, seemingly at random, depending on how you aggregate (how many servers you have, which tag sets you use, and so on). And I'm apparently the only one who is seriously bothered. Everyone else is somewhere between "Eh, that's the best data we have." and "What do you mean the numbers are wrong?" Is this normal?
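
To make the problem concrete, here is a minimal Python sketch with made-up numbers. The two servers, their request counts, and the latencies are all assumptions for illustration, not data from any real system. It compares the average of per-server p90s with the p90 computed over all requests pooled together.

```python
def percentile(values, p):
    """Nearest-rank percentile; p is an integer from 1 to 100."""
    ordered = sorted(values)
    # Integer ceiling of p * n / 100, so there is no floating-point rounding.
    rank = max(1, -(-p * len(ordered) // 100))
    return ordered[rank - 1]


# Hypothetical traffic: a busy, fast server and a nearly idle, slow one.
server_a = [10] * 990    # 990 requests at 10 ms each
server_b = [1000] * 10   # 10 requests at 1000 ms each

# What the dashboard does: each server reports its own p90, then average.
per_server_p90 = [percentile(server_a, 90), percentile(server_b, 90)]
averaged_p90 = sum(per_server_p90) / len(per_server_p90)

# What users actually experience: p90 over every request in the fleet.
true_p90 = percentile(server_a + server_b, 90)

print(f"average of per-server p90s: {averaged_p90} ms")
print(f"p90 over all requests:      {true_p90} ms")
```

With these numbers the averaged figure is 505 ms while the pooled p90 is 10 ms, because the average gives the server that handled 10 requests the same weight as the one that handled 990.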
|
On the other hand, don't we routinely average things in ways we can't technically justify? Teachers give out homework and tests, assign a weight to each homework and each test, and average the results to assign you a grade. Is that grade arbitrary? Yes. Does that mean it is useless? Probably not.
If I were you, I would make your case based less on "you can't do that" and more on "if we used this approach to aggregation, we would improve our ability to detect cases X and Y that our customers really care about".
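
The comment above does not name a specific alternative. One common option, shown here purely as an assumption, is to have each server report bucket counts instead of percentiles, sum the counts centrally, and read the percentile off the merged histogram. The bucket boundaries and traffic numbers below are hypothetical, and the result is only as precise as the bucket widths allow.

```python
from collections import Counter

# Hypothetical bucket upper bounds in milliseconds, shared by every server.
BUCKET_UPPER_BOUNDS_MS = [10, 50, 100, 500, 1000]


def to_histogram(latencies_ms):
    """Count requests per bucket, keyed by the bucket's upper bound."""
    hist = Counter()
    for latency in latencies_ms:
        # First bucket whose upper bound covers this latency, else overflow.
        bound = next(
            (b for b in BUCKET_UPPER_BOUNDS_MS if latency <= b), float("inf")
        )
        hist[bound] += 1
    return hist


def percentile_from_histogram(hist, p):
    """Upper bound of the bucket containing the p-th percentile (nearest rank)."""
    total = sum(hist.values())
    if total == 0:
        raise ValueError("empty histogram")
    rank = max(1, -(-p * total // 100))  # integer ceiling of p * total / 100
    seen = 0
    for bound in sorted(hist):
        seen += hist[bound]
        if seen >= rank:
            return bound


# Each server ships counts; unlike percentiles, counts can simply be added.
server_a = to_histogram([10] * 990)
server_b = to_histogram([1000] * 10)
merged = server_a + server_b

fleet_p90 = percentile_from_histogram(merged, 90)
print(f"fleet p90 is at most {fleet_p90} ms")
```

Because bucket counts add up exactly, the merged result does not depend on how many servers there are or how they are grouped, which is the property the averaged percentiles lack.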