by scrash 246 days ago
Branch prediction isn't really as much of an issue in modern interpreters. I can really recommend reading https://inria.hal.science/hal-01100647/document

The paper is 10 years old. While the gap between a threaded interpreter (a dispatch at the end of every handler) and a non-threaded one (a loop over a switch) isn't as big as it used to be, it's still 15-30% on very fast modern interpreters. For example, I measured a performance improvement of between 14% and 29% from threading Wizard's interpreter[1].

[1] https://dl.acm.org/doi/10.1145/3563311
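
For anyone unfamiliar with the distinction, here is a minimal sketch of the two dispatch styles in C. It assumes GCC or Clang, since the threaded version relies on computed goto (`&&label`), which is a GNU extension rather than standard C. The toy opcodes, `demo_prog`, and the function names are made up for illustration and are not taken from Wizard's interpreter.

```c
#include <stdint.h>

/* Hypothetical toy bytecode for illustration only. */
enum { OP_PUSH, OP_ADD, OP_HALT };

/* Non-threaded: a loop over a switch. Every handler jumps back to the
 * top, so all dispatches share a single indirect branch. */
int64_t run_switch(const uint8_t *code) {
    int64_t stack[16];
    int sp = 0;
    for (;;) {
        switch (*code++) {
        case OP_PUSH: stack[sp++] = *code++; break;
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; break;
        case OP_HALT: return stack[sp - 1];
        default:      return -1; /* unknown opcode */
        }
    }
}

/* Threaded: each handler ends with its own dispatch, so every handler
 * has a separate indirect branch for the predictor to track. */
int64_t run_threaded(const uint8_t *code) {
    /* Table of label addresses, indexed by opcode (GNU extension). */
    static void *const handlers[] = { &&do_push, &&do_add, &&do_halt };
    int64_t stack[16];
    int sp = 0;
#define DISPATCH() goto *handlers[*code++]
    DISPATCH();
do_push:
    stack[sp++] = *code++;
    DISPATCH();
do_add:
    sp--;
    stack[sp - 1] += stack[sp];
    DISPATCH();
do_halt:
    return stack[sp - 1];
#undef DISPATCH
}

/* Sample program: push 2, push 3, add, push 4, add, halt. */
const uint8_t demo_prog[] = {
    OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_ADD, OP_HALT
};
```

Both `run_switch(demo_prog)` and `run_threaded(demo_prog)` evaluate (2 + 3) + 4. A program this small says nothing about the performance gap; that only shows up on real workloads with many dispatches.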

Interesting paper :) I've kept choosing threaded myself, but would have put the gap in a 5-10% range. I guess the branch predictor hasn't kept up. (Also trying to resist getting nerdsniped into measuring it myself 0_0)

Testing this in Wizard is fairly easy.

Compare the running speed of the two binaries built with different options:

    % V3C_OPTS=-redef-field=FastIntTuning.threadedDispatch=true ./build.sh wizeng x86-64-linux

    % bin/wizeng.x86-64-linux --mode=int test/microbench/100ms/fib.wasm


    % V3C_OPTS=-redef-field=FastIntTuning.threadedDispatch=false ./build.sh wizeng x86-64-linux

    % bin/wizeng.x86-64-linux --mode=int test/microbench/100ms/fib.wasm