| The worst part about this: "Running experiments until you get a hit" is that it's literally what we software optimization engineers do. We keep writing optimizations until we find one that shows a statistically significant speed-up. Hence we are running experiments until we get a hit. The only defense I know against this is to have a good perf CI. If your patch looked like a speed-up before committing, but perf CI doesn't see the speed-up, then you just p-hacked yourself. But even that isn't foolproof. You just have to accept that statistics lie and that you will fool yourself. Prepare accordingly. |
I don't think that is what it is saying. The problem it describes is writing one particular optimization (your hypothesis) and then running the same experiment (measuring the speed-up) over and over until you see a good number.
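To make that concrete, here is a rough simulation sketch. Everything in it is an assumption for illustration: the names `looks_faster` and `false_hit_rate`, the made-up benchmark (mean 100, standard deviation 5, 30 samples per side), and the crude one-sided z-style cutoff of 1.645 standing in for a proper significance test. The simulated patch has no real effect, so every "hit" is a false positive.

```python
import random
import statistics


def looks_faster(rng, n=30, threshold=1.645):
    """Benchmark a patch with NO real effect once; True means a false 'hit'."""
    # Baseline and patched timings come from the same distribution,
    # so any apparent speed-up is pure noise.
    base = [rng.gauss(100.0, 5.0) for _ in range(n)]
    patched = [rng.gauss(100.0, 5.0) for _ in range(n)]
    std_err = (statistics.variance(base) / n + statistics.variance(patched) / n) ** 0.5
    z = (statistics.mean(base) - statistics.mean(patched)) / std_err
    return z > threshold  # roughly a one-sided 5% test


def false_hit_rate(max_reruns, trials=1000, seed=0):
    """Fraction of trials where at least one of max_reruns runs looks like a win."""
    rng = random.Random(seed)
    hits = sum(
        any(looks_faster(rng) for _ in range(max_reruns))
        for _ in range(trials)
    )
    return hits / trials


if __name__ == "__main__":
    print("measure once:       ", false_hit_rate(1))
    print("rerun up to 10 times:", false_hit_rate(10))
```

Measuring once should give a false-hit rate near the nominal 5%, while allowing up to ten reruns should push it toward 1 - 0.95^10, or about 40%. The exact figures depend on the seed and on the approximations above.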
It's fine to keep trying more optimizations and keep only the ones that show a genuine speed-up.
Of course the real world is a lot more nuanced -- often, measuring the performance speed-up involves a hypothesis as well ("Does this change to the allocator improve network packet transmission performance?"). You might find that it does not, and then run the same change against disk I/O tests to see if it helps that case. That is presumably okay too if you're careful.
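One possible reading of "careful" here, and this is my assumption rather than anything stated above, is to tighten the significance cutoff when the same change is tested against several benchmarks. A minimal sketch using a Bonferroni correction, with a hypothetical helper name and made-up p-values:

```python
def significant_after_bonferroni(p_values, alpha=0.05):
    """Return the benchmarks whose p-value survives a Bonferroni correction."""
    if not p_values:
        return {}
    # Split the overall false-positive budget evenly across all tests run.
    cutoff = alpha / len(p_values)
    return {name: p for name, p in p_values.items() if p < cutoff}


# Made-up p-values for one allocator change tested on two benchmark suites.
results = {"network_tx": 0.30, "disk_io": 0.03}
print(significant_after_bonferroni(results))  # prints {}
```

With two tests the cutoff drops from 0.05 to 0.025, so the disk I/O result of 0.03 no longer counts as a hit on its own.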