Hacker News new | ask | show | jobs
by ignoramous 1596 days ago
Thanks.

> Why would you want to do that?

We store user preferences (200+ yes/no knobs) in a bitmap (well, a bitmap-index like the one in Hash Mapped-Array Tries). We want to capture those prefs in a single sub-domain (limited to 63 lower-case alphanumeric chars) or a URL (limited to 200 mixed-case alphanumerics). Today, we simply convert the bitmap into url-b64 (or, b32 to store it in the subdomain), but we will soon run out the 63-char limit if we introduce more knobs.

A demonstration of it is here, in case the above didn't explain it well: https://rethinkdns.com/configure (choose blocklists, and see the selection generate a path appended to the base-url shown in the search-bar).

2 comments

Oh. No, i don't think it's possible - i'd suggest to just use multiple subdomains.

Technically, you can squeeze out some bits: there are 36-37 possible characters of which you are using only 32, so with arithmetic coding you would be looking at about 1 extra bit for 5 characters, but it's a nightmare to code. And after those extra bits run out, you will get the same problem anyway.

Does another party need to decode the url? What about using a dictionary for the top 10k seen starting combinations and then encode the rest?

What about run length encoding? 1-9 for positive sequences. a-i for negative sequences (max means pattern continues) and the rest for frequent patterns like alternating sequences, etc

9967b would be 24 yes, 1 no, 7 yes, 3 nos, 1 yes etc

Thanks. RLE is a hit-and-a-miss (101010101010 etc); top-k is heuristic-based. Those are viable solutions, nevertheless (esp, top-k). Puny Code, which DNS uses, I thought was pretty neat for fitting in a state-machine in printable characters. Takes a bunch of CPU resources to restore it, though.

I was wondering if there are other techniques that I may not be aware of.