| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by tveita 3064 days ago

You don't need arbitrary precision, just enough precision. Specifically, in this case you only need 16 bits. You can start with an upper and lower bound, and narrow it down for each number.

Here's an example with numbers dumped from a quick prototype encoder. The exact range numbers aren't that important, but note how the range gets smaller for every symbol we encode:

Say we want to encode the ordered numbers 10, 15, 31, 31.

We start with the full range [0, 65536)

There are C(32 - 10 + 2, 3) = 2024 combinations that start with 10, out of 52360 possible, so that narrows it down to around 4% of the full range, we'll use [49702, 52236)

For the next number we only allow numbers 10 and above, 15 has 153 out of 2024 possibilities, range [51022, 51214)

For the next number, 31 is only 1 of the 153 possibilities, so it gets a tiny range, [51212, 51214).

And now 31 is the next number in 1 out of 1 of cases, so the range is again [51212, 51214)

We can code the sequence as either 51212 or 51213, since we used a range that is slightly larger than needed some combinations have multiple codes. We could have started with the smaller range [0, 52360) instead to get a bijection.