Hacker News new | ask | show | jobs
by vitus 544 days ago
In general, this isn't even a JS problem, or a JIT problem. You have similar issues even in a lower-level language like C++: branch prediction, cache warming, heck, even power state transitions if you're using AVX512 instructions on an older CPU. Stop-the-world GC causing pauses? Variation in memory management exists in C, too -- malloc and free are not fixed-cost, especially under high churn.

Benchmarks can be a useful tool, but they should not be mistaken for real-world performance.

2 comments

I think that sort of thing is a bit different because you can have more control over it. If you’re serious about benchmarking or running particularly performance sensitive code, you’ll be able to get reasonably consistent benchmark results run-to-run, and you’ll have a big checklist of things like hugepages, pgo, tickless mode, writing code a specific way, and so on to get good and consistent performance. I think you end up being less at the mercy of choices made by the VM.
The amount of control you have varies in a continuum between hand-written assembly to SQL queries. But there isn't really a difference of kind here, it's just a continuum.

If there's anything unique about Javascript is that has an unusually high rate of "unpredictability" / "abstraction level". But again, it has pretty normal values of both of those, just the relation is away from the norm.

I’m quite confused by your comment. I think this subthread is about reasons one might see variance on modern hardware in your ‘most control’ end of the continuum.

The start of the thread was about some ways where VM warmup (and benchmarking) may behave differently from how many reasonably experienced people expect.

I claim it’s reasonably different because one cannot tame parts of the VM warmup issues whereas one can tame many of the sources of variability one sees in high-performance systems outside of VMs, eg by cutting out the OS/scheduler, by disabling power saving and properly cooling the chip, by using CAT to limit interference with the L3 cache, and so on.

> eg by cutting out the OS/scheduler, by disabling power saving and properly cooling the chip, by using CAT to limit interference with the L3 cache, and so on

That's not very different from targeting a single JS interpreter.

In fact, you get much more predictability by targeting a single JS interpreter than by trying to guess your hardware limitations... Except, of course if you target a single hardware specification.

When we were upgrading to ES6 I was surprised/relieved to find that moving some leaf node code in the call graph to classes from prototypes did help. The common wisdom at the time was that classes were still relatively expensive. But they force strict, which we were using inconsistently, and they flatten the object representation (I discovered these issues in the heap dump, rather than the flame graph). Reducing memory pressure can overcome the cost of otherwise suboptimal code.

OpenTelemetry having not been invented yet, someone implemented server side HAR reports and the data collection on that was a substantial bottleneck. Particularly the original implementation.

Running benchmarks on the old Intel MacBook a previous job gave me was like pulling teeth. Thermal throttling all the time. Anything less than at least a 2x speed up was just noise and I’d have to push my changes to CI/CD to test, which is how our build process sprouted a benchmark pass. And a grafana dashboard showing the trend lines over time.

My wheelhouse is making lots of 4-15% improvements and laptops are no good for those.