That's totally fine, but benchmarks are like standardized tests such as the SAT. They measure something, and it totally makes sense that each release bests the prior one on these benchmarks.

It may even be the case that, in measuring themselves against the benchmarks, these product teams sacrifice some real-world performance (just as a student who only studies for the SAT might sacrifice some real-world skills).