Hacker News new | ask | show | jobs
by s3th 3750 days ago
It's not too late! The wasm binary encoding is open to change up until the browsers ship a stable MVP implementation (then the plan is to freeze the encoding indefinitely at version 1).

The primary advantage of LEB128 is (as you mentioned) that it's a relatively common encoding. PrefixVarint is not an open source encoding IIUC.

We'll do some experiments in terms of speed. If the gains are significant we may be able to adopt something similar (this [0] looks like a related idea).

Thanks for the suggestion.

[0]: http://www.dlugosz.com/ZIP2/VLI.html

2 comments

PrefixVarint isn't open-source, but the encoding is trivial.

PrefixVarints are a folk theorem of Computer Science, (re-)invented in many times and places.

I actually coded it up once in Python and once in C before joining Google, and was chatting with an engineer, complaining about the Protocol Buffer varint encoding. The person I was complaining to, said "Yea, Doug Rhode did exactly that, called it PrefixVarint. He benchmarked it much faster."

See my other comments on this thread for a simple implementation of a bijective big-endian prefix varint encoder. You may or may not want a bijective encoding, and probably want little-endian. I'm just used to writing big-endian encoders (for lexographical sorting reasons), so that was faster for me to whip up a demonstration of a bijective encoder.

A real implementation would use a switch statement instead of a loop. One might use a lookup table or a few instructions of inline assembly to calculate the number of leading ones in the first byte, and switch on that.