by jraph 459 days ago
Reading that you are extremely embarrassed and sorry that you made such a huge oversight, I was imagining you had broken something / worsened CPython's performance.

But it's nothing like that. You announced a 10-15% perf improvement, but the improvement is more like 1-5% on a non-buggy compiler. It's not even that the 10-15% figure is wrong; it's just that it's correct only under very specific conditions, unbeknownst to you.

IIUC, you did your homework: you made an improvement, you measured a 10-15% perf improvement, the PR was reviewed by other people, etc. It just so happens that this 10-15% figure is misleading because of an issue with the version of clang you happened to use to measure. Unless I'm missing something, it looks like a fair mistake anyone could have reasonably made. It even looks like it was hard to not fall into this trap. You could have been more suspicious seeing such a high number, but hindsight is 20/20.

Apparently, you still brought significant performance improvements, and your work also helped uncover a compiler regression. The wrong number seems quite minor in comparison. I wonder who was actually hurt by this. I only discovered the "case" just now, but at first glance it doesn't feel like you owe an apology to anyone. Kudos for all this!
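To make the arithmetic concrete, here is a minimal sketch of how the same change can show up as either figure depending on which baseline it is measured against. The timings are hypothetical, picked only to fall inside the ranges quoted above, and the sketch assumes the compiler regression slows down only the baseline build; `speedup_percent` is an illustrative helper, not something from the PR.

```python
def speedup_percent(baseline_seconds: float, new_seconds: float) -> float:
    """Relative speedup of the new build over the baseline, in percent."""
    return (baseline_seconds / new_seconds - 1.0) * 100.0


# Hypothetical benchmark timings in seconds (not measured data).
new_interpreter = 1.00      # build that includes the change
baseline_regressed = 1.12   # old interpreter, compiler with the regression
baseline_healthy = 1.03     # old interpreter, compiler without the regression

# Same change, two baselines, two very different headline numbers.
inflated = speedup_percent(baseline_regressed, new_interpreter)
realistic = speedup_percent(baseline_healthy, new_interpreter)

print(f"vs regressed baseline: {inflated:.1f}%")
print(f"vs healthy baseline:   {realistic:.1f}%")
```

Under these assumed numbers the measurement itself is accurate in both cases; only the baseline differs, which is why benchmarking against more than one compiler version is a cheap sanity check.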


In a way, by indirectly helping fix this bug, they led to a ~10% performance increase for everyone who was using that faulty compiler! That's even better than an optional flag that many people won't know about or use.

That performance regression only hit code with a very large number of paths that all end in the same table of computed gotos. Most likely only relatively complex interpreters were affected, so it's not a broad performance improvement. But it is nice to have an example of the compiler's new heuristic failing, as evidence that it needs to be tunable.

Well, that includes at least everyone using Python built with that compiler.

> IIUC, you did your homework: you made an improvement, you measured a 10-15% perf improvement, the PR was reviewed by other people, etc. It just so happens that this 10-15% figure is misleading because of an issue with the version of clang you happened to use to measure. Unless I'm missing something, it looks like a fair mistake anyone could have reasonably made. It even looks like it was hard to not fall into this trap. You could have been more suspicious seeing such a high number, but hindsight is 20/20.

Hah! Is this a Gettier problem [0]?

1. True: The PR improves Python performance 10-15%.

2. True: Ken believes that the PR improves Python performance 10-15%.

3. True: Ken is justified in believing that the PR improves Python performance 10-15%.

Of course, PR discussions don't generally revolve around whether or not the PR author "knows" that the PR does what they claim it does. Still: these sorts of epistemological brain teasers seem to come up in the performance measurement field distressingly often. I wholeheartedly agree that Ken deserves all the kudos he has received; still, I also wonder if some of the strategies used to resolve the Gettier problem might be useful for code reviewers to center themselves every once in a while. Murphy's Law and all that.

[0]: https://en.wikipedia.org/wiki/Gettier_problem

Could very well be!

Interesting, I didn't know about the Gettier problem, thanks for sharing. You could try submitting that page as a proper HN post.