by weinzierl
350 days ago
Is your code really fast if you haven't measured it properly? I'd say measuring is hard but a prerequisite for writing fast code, so truly fast code is harder.
The number one mistake I see people make is measuring one time and taking the results at face value. If you do nothing else, measure three times and you will at least have a feeling for the variability of your data. If you want to compare two versions of your code with confidence, there is usually no way around proper statistical analysis.
Which brings me to the second mistake: when measuring runtime, taking the mean is not a good idea. Runtime measurements usually skew heavily towards a theoretical minimum, which is a hard lower bound. The distribution is heavily lopsided with a long tail. If your objective is to compare two versions of some code, the minimum is a much better measure than the mean.
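To make the point about repeated measurements concrete, here is a minimal Python sketch. The names `measure`, `summarize` and `workload` are made up for illustration, and `workload` is just a stand-in for whatever code you are actually timing. It runs the code many times and prints the minimum next to the median and mean, so you can see how far the long tail drags the mean away from the minimum.

```python
import statistics
import time


def measure(func, repeats=30):
    """Return wall-clock runtimes in seconds for `repeats` calls of func."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Summarize runtimes; the minimum is the estimate least affected by noise."""
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "mean": statistics.mean(samples),
        "max": max(samples),
    }


def workload():
    # Hypothetical stand-in for the code under test.
    return sum(i * i for i in range(10_000))


if __name__ == "__main__":
    for name, value in summarize(measure(workload)).items():
        print(f"{name:>6}: {value * 1e6:.1f} us")
```

On a noisy machine the mean and max tend to move around between runs while the minimum stays comparatively stable, which is why it is the more useful number when comparing two versions. This is only a rough check and not a replacement for proper statistical analysis.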
|
You'll see this in any properly active online system. Back in my previous job we had to drill it into teams that mean() was never an acceptable latency measurement. For that reason the telemetry agent we used provided out-of-the-box p50 (median), p90, p95, p99 and max values for every timer measurement window.
The difference between p99 and max was an incredibly useful indicator of poor tail latency cases. After all, every one of those max figures was an occurrence of someone or something experiencing the long wait.
These days, if I had the pleasure of dealing with systems where individual nodes handled thousands of messages per second, I'd add p999 (the 99.9th percentile) to the mix.
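For anyone who wants to see what such a summary looks like, here is a rough Python sketch of per-window percentiles. It is not the telemetry agent mentioned above; `percentile` and `latency_summary` are hypothetical names, and the nearest-rank method is just one common way to define a percentile. Quantiles are given in parts per thousand so that p999 can be computed with exact integer arithmetic.

```python
def percentile(ordered, per_mille):
    """Nearest-rank percentile of an already sorted list.

    per_mille is the quantile in parts per thousand (500 = p50, 999 = p999).
    """
    if not ordered:
        raise ValueError("no samples in this window")
    # Integer ceiling of per_mille * n / 1000, clamped to at least rank 1.
    rank = max(1, -(-per_mille * len(ordered) // 1000))
    return ordered[rank - 1]


def latency_summary(samples):
    """Return p50/p90/p95/p99/p999 and max for one measurement window."""
    ordered = sorted(samples)
    quantiles = {"p50": 500, "p90": 900, "p95": 950, "p99": 990, "p999": 999}
    summary = {name: percentile(ordered, q) for name, q in quantiles.items()}
    summary["max"] = ordered[-1]
    return summary


if __name__ == "__main__":
    # Made-up window: 99 requests at 10 ms and a single 1000 ms straggler.
    window_ms = [10] * 99 + [1000]
    print(latency_summary(window_ms))
    print("mean:", sum(window_ms) / len(window_ms))
```

In this made-up window every percentile up to p99 reports 10 ms, the max reports 1000 ms, and the mean lands at 19.9 ms, a value no request actually experienced. With only 100 samples p999 falls on the max, which is why it only becomes a separate signal at thousands of messages per window.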