| HN Mirror

	by google2342 2550 days ago
	You shouldn't use the mean when benchmarking; the median or the fastest time is a better choice. Lots of unrelated things can happen on a computer (usually in the OS) that make a single run take 1000x longer.
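	To make the effect concrete, here is a minimal sketch with made-up numbers: nine runs of about 1 ms plus one run that took 1000x longer. The `summarize` helper and the timings are hypothetical, chosen only to illustrate the point.

```python
import statistics

def summarize(timings):
    """Return the mean, median, and minimum of a list of timings (seconds)."""
    return {
        "mean": statistics.mean(timings),
        "median": statistics.median(timings),
        "min": min(timings),
    }

# Hypothetical data: nine runs near 1 ms, one run hit by a 1000x stall.
timings = [0.001] * 9 + [1.0]
summary = summarize(timings)

# The single outlier drags the mean to roughly 100x the typical run,
# while the median and the minimum stay at the typical value.
for name, value in summary.items():
    print(f"{name}: {value:.4f} s")
```

	The median and minimum ignore the stalled run entirely; the mean reports a time that no typical run actually took.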

2 comments

chronial 2550 days ago

Interesting point. The %timeit functionality actually used to output the mean of the x fastest runs; they seem to have changed that at some point. The docs still explain the old behavior [1].

I assume they feel your concerns are addressed because they display the stddev together with the mean, so you can see whether there were any extreme outliers.

[1] https://ipython.org/ipython-doc/dev/interactive/magics.html#...
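The two reporting styles can be compared with the standard library's `timeit.repeat`. This is only a rough sketch, not IPython's implementation: the helper names, the statement being timed, and `k=3` are assumptions for illustration (the actual value of x is not given above).

```python
import statistics
import timeit

def time_stmt(stmt, repeat=7, number=1000):
    """Time stmt `repeat` times, `number` loops each; return seconds per loop."""
    raw = timeit.repeat(stmt, repeat=repeat, number=number)
    return [t / number for t in raw]

def mean_of_fastest(per_loop, k=3):
    """Mean of the k fastest runs (k is a hypothetical choice)."""
    return statistics.mean(sorted(per_loop)[:k])

per_loop = time_stmt("sum(range(100))")
report = {
    "mean": statistics.mean(per_loop),
    "stdev": statistics.stdev(per_loop),
    "mean_of_fastest_3": mean_of_fastest(per_loop, k=3),
}

# A stdev that is large relative to the mean hints at outlier runs.
print(f"{report['mean']:.3e} s ± {report['stdev']:.3e} s per loop")
print(f"mean of 3 fastest: {report['mean_of_fastest_3']:.3e} s per loop")
```

The mean of the fastest runs can never exceed the overall mean, so a big gap between the two is another sign that some runs were disturbed.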

jcranmer 2550 days ago

If you're testing a single-threaded benchmark, then the test statistics aren't going to be meaningfully different in interpretation, especially if you're only asking the question "is A or B faster?" What's more important is that you capture enough runs to characterize the distribution well; if you have that, you'll get meaningful results no matter which statistic you're actually measuring.
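As a rough illustration of that claim, the sketch below draws many simulated runs for two hypothetical implementations and checks which one each statistic calls faster. Everything here is assumed for the example: the base times, the exponential noise model, the sample size, and the fixed seed.

```python
import random
import statistics

def simulate_runs(base, n, rng):
    """Simulated timings: a fixed base cost plus positive, skewed noise."""
    return [base + rng.expovariate(20.0) for _ in range(n)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
runs_a = simulate_runs(1.0, 200, rng)  # hypothetical implementation A
runs_b = simulate_runs(1.2, 200, rng)  # hypothetical implementation B

stats = {"min": min, "median": statistics.median, "mean": statistics.mean}

# For each statistic, record which implementation it reports as faster.
verdicts = {
    name: "A" if fn(runs_a) < fn(runs_b) else "B"
    for name, fn in stats.items()
}
print(verdicts)
```

With 200 runs per side and a clear gap between the two, all three statistics give the same answer; with only a handful of runs, or a gap smaller than the noise, they could disagree.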