by jerf 1251 days ago
One of my favorite things is "the thing that sounds obvious when I say it, but you didn't think of it before". Here's one related to benchmarking: For A to be 120x better than B in a comparable task, that has to mean that B is leaving that much performance on the table in the first place.

Now, let's combine this with the persistent tendency of developers to take one specific benchmark as indicative of overall performance, a tendency often preyed on by benchmarkers trying to sell things.

Is it really plausible that neo4j takes 120x longer than it needs to on all operations? A dedicated graph database that has been tuned and optimized for that task for quite a while now?

I'm not quite going to rate that a 0 probability, but it's definitely a very big claim, and its probability sits comfortably below "someone's gaming the numbers" and "the benchmark is not as comparable as claimed". There's a faint chance the latter still matches a production use case. For instance, certain comparisons of NoSQL DBs and SQL DBs are "not fair" in that the two won't be doing remotely the same things for the queries, and the performance landscape is very complicated, with one side winning handily for some tasks and the other side winning handily for others. If your use case falls into one of those big wins, you may not care about the "fairness".

But there's still a pretty big chunk of probability mass on it being just plain not comparable. How many times have we seen a ludicrous benchmarking claim of relative superiority, only for the losing side to pop up and say something to the effect of "Hey, did you consider adding the correct index to the data? Oh look, if you do that, we win by a factor of 4."
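
To make the "did you add the correct index" point concrete, here's a minimal sketch using Python's built-in sqlite3 module. The `edges` table, the `idx_edges_src` index, and the `query_plan` helper are all made up for illustration; nothing here is from the benchmark under discussion. It just shows the same query going from a full table scan to an index lookup, which is the kind of setup difference that can swing a comparison.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return SQLite's query plan for `sql` as one string (hypothetical helper)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable plan detail is the fourth column of each row.
    return " | ".join(row[3] for row in rows)

conn = sqlite3.connect(":memory:")
# Illustrative table: a simple edge list, 10,000 rows.
conn.execute("CREATE TABLE edges (src INTEGER, dst INTEGER)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    ((i, i + 1) for i in range(10_000)),
)

sql = "SELECT dst FROM edges WHERE src = ?"

# Without an index, SQLite has to scan the whole table.
plan_without_index = query_plan(conn, sql, (1234,))

# Add the "correct index" and ask for the plan again.
conn.execute("CREATE INDEX idx_edges_src ON edges (src)")
plan_with_index = query_plan(conn, sql, (1234,))

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

Both versions return the same rows; only the work done to find them changes. A benchmark that runs one system in the first configuration and the other in the second is measuring the setup, not the engines.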

Tell me you're 1.2x or 1.5x faster or something, or that your clever compression means I can remove 1/3rd of my systems or something. Keep it in the range of plausible.

While I'm sure this won't affect the marketing of this company any, ludicrously large claims of 10x+ speed improvements actually turn me off, not attract me. You'd better have some sort of super compelling reason why you somehow managed to be that fast over your competitor, like, "we're the first to successfully leverage GPUs" or something like that. Otherwise I'm going to guess "Actually, you have an O(log n log log n) algorithm over their O(log n log n) algorithm and you cranked the data set up to the ludicrous sizes it takes to get an arbitrarily large X factor improvement over your competition" or something like that.

(Always gotta love people comparing two completely different O(...) algorithms against each other and declaring one is X times faster than the other. This is another major source of "10,000x faster!"... yeah, O(n log n) is "10,000x faster!" than O(n^2), sure. It's also 100,000 times faster, 10 times faster, and a billionkajillion times faster, all at the same time.)
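
A quick sketch of why "X times faster" is meaningless across different complexity classes: the ratio is a function of n, so you can pick your X by picking your data size. This assumes idealized operation counts with all constant factors set to 1 and base-2 logs, which is my simplification, and the function names are made up for illustration.

```python
import math

def speedup_nlogn_over_n2(n: int) -> float:
    """Ratio of n^2 to n*log2(n) operation counts, constants ignored."""
    return (n * n) / (n * math.log2(n))

def speedup_loglog_over_logsq(n: int) -> float:
    """Ratio of (log n * log n) to (log n * log log n), constants ignored.

    Algebraically this is log2(n) / log2(log2(n)), so it grows very
    slowly, but it still grows without bound as n increases.
    """
    return math.log2(n) / math.log2(math.log2(n))

# Same two pairs of algorithms, wildly different "X times faster" claims,
# depending only on which n the benchmarker chose.
for n in (10**2, 10**4, 10**6, 10**9):
    print(
        f"n={n:>13,}  "
        f"n^2 vs n log n: {speedup_nlogn_over_n2(n):>14,.1f}x  "
        f"log^2 vs log*loglog: {speedup_loglog_over_logsq(n):.2f}x"
    )
```

Real systems have constant factors, caches, and I/O that this ignores, so none of these are predictions. The only point is that the headline number falls out of the chosen n.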