Hacker News
by pygy_ 544 days ago
I have been sleeping on this for quite a while (long covid is a bitch), but I have built a benchmarking lib that sidesteps quite a few of these problems, by:

- running the benchmark in thin slices, interspersed and shuffled, rather than in one big batch per item (which also avoids having one scenario penalized by transient noise)

- displaying graphs that show possible multi-modal distributions when the JIT gets in the way

- varying the lengths of the thin slices between runs to work around the poor timer resolution in browsers

- assigning the results of the benchmark to a global (or a variable in the parent scope, as in the WEB demo below) to avoid dead code elimination
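
For readers who want to see the shape of the idea, here is a minimal sketch of the four points above. It is not bunchmark's actual API: `runInterleaved`, `shuffle`, `median`, `sink`, and the `slices`/`minReps`/`maxReps` options are all hypothetical names and defaults picked for illustration.

```javascript
// Hypothetical sketch, not bunchmark's actual API.
// Sink in the outer scope: storing each result here makes it harder for
// the JIT to throw the benchmarked work away as dead code.
let sink;

// Fisher-Yates shuffle (in place).
function shuffle(items) {
  for (let i = items.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [items[i], items[j]] = [items[j], items[i]];
  }
  return items;
}

// Run every task in many thin slices instead of one big batch per task.
function runInterleaved(tasks, { slices = 50, minReps = 500, maxReps = 2000 } = {}) {
  const samples = new Map(tasks.map((t) => [t.name, []]));
  for (let s = 0; s < slices; s++) {
    // Vary the slice length between slices to work around coarse timers.
    const reps = minReps + Math.floor(Math.random() * (maxReps - minReps + 1));
    // Fresh random task order for every slice.
    for (const task of shuffle([...tasks])) {
      const start = performance.now();
      for (let i = 0; i < reps; i++) sink = task.fn();
      const elapsed = performance.now() - start;
      samples.get(task.name).push(elapsed / reps); // ms per call
    }
  }
  return samples;
}

function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = sorted.length >> 1;
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

const results = runInterleaved([
  { name: "Math.random", fn: () => Math.random() },
  { name: "Math.sqrt", fn: () => Math.sqrt(Math.random()) },
]);
for (const [name, times] of results) {
  console.log(name, "median ms/call:", median(times));
}
```

Keeping one sample per slice (rather than a single total per task) is what makes it possible to look at the distribution afterwards instead of just one average.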

This isn't a panacea, but it is better than the existing solutions, as far as I'm aware.

There are still issues because, sometimes, even if the task order is shuffled for each slice, the literal source order can influence how/if a bit of code is compiled, resulting in unreliable results. The "thin slice" approach can also dilute the GC runtime between scenarios if the amount of garbage isn't identical between scenarios.

I think it is, however, a step in the right direction.

- CLI runner for NODE: https://github.com/pygy/bunchmark.js/tree/main/packages/cli

- WIP WEB UI: https://flems.io/https://gist.github.com/pygy/3de7a5193989e0...

In both cases, if you've used JSPerf you should feel right at home in the WEB UI. The CLI UI is meant to replicate the WEB UI as closely as possible (see the example file).

1 comments

I hadn't run these in a while, but in the current Chrome version, you can clearly see the multi-modality of the results with the dummy Math.random() benchmark.
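
To illustrate what multi-modality looks like in per-slice timings, here is a rough text-histogram sketch. The `histogram` helper is hypothetical and the `synthetic` numbers are made up for the example; they are not measurements from Chrome or from the library.

```javascript
// Bucket samples into equal-width bins and return the count per bin.
function histogram(samples, binCount = 10) {
  const min = Math.min(...samples);
  const max = Math.max(...samples);
  // Fall back to a width of 1 when all samples are equal.
  const width = (max - min) / binCount || 1;
  const counts = new Array(binCount).fill(0);
  for (const x of samples) {
    // Clamp so the maximum value lands in the last bin.
    const bin = Math.min(binCount - 1, Math.floor((x - min) / width));
    counts[bin]++;
  }
  return counts;
}

// Made-up timings clustered around two values, for illustration only.
const synthetic = [10, 11, 10, 12, 11, 10, 30, 31, 29, 30, 31, 30];
histogram(synthetic, 5).forEach((count, i) => console.log(i, "#".repeat(count)));
```

With this input the first and last bins fill up and the middle ones stay empty, which is the kind of two-peaked shape a mean or median alone would hide.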