Hacker News new | ask | show | jobs
by ericyd 547 days ago
Maybe I'm doing it wrong, but when I benchmark code, my goal is to compare two implementations of the same function and see which is faster. This article seems to be concerned with finding some absolute metric of performance, but to me that isn't what benchmarking is for. Performance will vary based on hardware and runtime which often aren't in your control. The limitations described in this article are interesting notes, but I don't see how they would stop me from getting a reasonable assessment of which implementation is faster for a single benchmark.
5 comments

Well, the issue is that micro benchmarking in JS is borderline useless.

You can have some function that iterates over something and benchmark two different implementations and draw conclusions that one is better than the other.

Then, in real world, when it's in the context of some other code, you just can't draw conclusions because different engines will optimize the very same paths differently in different contexts.

Also, your micro benchmark may tell you that A is faster than B...when it's a hot function that has been optimized due to being used frequently. But then you find that B which is used only few times and doesn't get optimized will run faster by default.

It is really not easy nor obvious to benchmark different implementations. Let alone the fact that you have differences across engines, browsers, devices and OSs (which will use different OS calls and compiler behaviors).

I guess I've just never seen any alternative to microbenchmarking in the JS world. Do you know of any projects that do "macrobenchmarking" to a significant degree so I could see that approach?
Real world app and some other projects focus on entire app benchmarks.
The basic problem is that if the compiler handles your code in an unusual way in the benchmark, you haven't really measured the two implementations against each other, you've measured something different.

Dead code elimination is the most obvious way this happens, but you can also have issues where you give the branch predictor "help", or you can use a different number of implementations of a method so you get different inlining behavior (this can make a benchmark better or worse than reality), and many others.

As for runtime, if you're creating a library, you probably care at least a little bit about alternate runtimes, though you may well just target node/V8 (on the JVM, I've done limited benchmarking on runtimes other than HotSpot, though if any of my projects get more traction, I'd anticipate needing to do more).

It’s some of both because the same things that make benchmarking inaccurate make profiling less accurate. How do you know you should even be looking at that function in the first place?

You might get there eventually but a lot of people don’t persevere where perf is concerned. They give up early which leaves either a lot of perf on the table or room for peers to show them up.

You're not wrong, but there are cases where "absolute" performance matters. For example, when your app must meet a performance SLA.
Isn't that more profiling than benchmarking?
No? Profiling tells you which parts of your code take time.
Thanks