Hacker News
by mjevans 1825 days ago
32 bits raw <==> 33 bits of QR alphanum (3 character pairs at 11 bits each).

In blocks of 4 bytes this encodes as 6 'base45' (QR alphanum) characters, and uses the same lookup table.
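
A minimal sketch of that arithmetic, not taken from the RFC or the QR standard text: `encode_block` and `qr_alphanumeric_bits` are hypothetical names, and the most-significant-digit-first order is an assumption made only for illustration. The table string follows the QR alphanumeric codes (0-9, A-Z, then space, $, %, *, +, -, ., /, :), and each character pair is packed as 45*a + b in 11 bits.

```python
# QR alphanumeric table: a character's index in this string is its code (0..44).
B45 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def encode_block(block: bytes) -> str:
    """Encode exactly 4 bytes as 6 table characters.

    Digit order (most significant first) is an assumption for this sketch.
    """
    if len(block) != 4:
        raise ValueError("expected exactly 4 bytes")
    value = int.from_bytes(block, "big")
    digits = []
    for _ in range(6):
        value, digit = divmod(value, 45)
        digits.append(B45[digit])
    return "".join(reversed(digits))


def qr_alphanumeric_bits(text: str) -> str:
    """Pack an even-length string as QR alphanumeric pairs: 45*a + b in 11 bits."""
    if len(text) % 2:
        raise ValueError("this sketch only handles an even number of characters")
    pairs = zip(text[0::2], text[1::2])
    return "".join(format(45 * B45.index(a) + B45.index(b), "011b") for a, b in pairs)


chars = encode_block(b"\xde\xad\xbe\xef")
bits = qr_alphanumeric_bits(chars)
print(len(chars), len(bits))  # 6 characters, 33 bits
```

The largest pair value is 44*45 + 44 = 2024, which is why 11 bits (up to 2047) are enough per pair.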

https://en.wikipedia.org/wiki/QR_code#Encoding

The "Alphanumeric character codes" table there, at least at a glance, is identical to the RFC's lookup table.


The GP is saying that equal efficiency is possible using only 41 of the 45 characters (41^6 = 4,750,104,241 still exceeds 2^32 = 4,294,967,296), so reducing the symbol set would make base45 (now base41) more useful as a general encoding, not just an encoding for QR codes.
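
The 41-character claim can be checked with a few lines. This is only a sanity check under the thread's framing of 4 bytes per 6 characters; `min_alphabet_size` is a hypothetical helper name.

```python
def min_alphabet_size(n_bytes: int, n_chars: int) -> int:
    """Smallest alphabet size whose n_chars-long strings cover every n_bytes value."""
    size = 2
    while size ** n_chars < 256 ** n_bytes:
        size += 1
    return size


print(min_alphabet_size(4, 6))  # 41
print(40 ** 6 < 2 ** 32 <= 41 ** 6)  # True
```

So 40 characters would fall just short, and anything from 41 to 45 gives the same 6-characters-per-4-bytes ratio.
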
That's a fair point that wasn't made clearly enough. I have two questions, though:

1) Is it worth no longer sharing the same lookup table? My supposition is that this wouldn't practically matter in the modern bloated environment anyway, but it's still annoying and might be a source of additional bugs.

2) Which 4 characters would be skipped, and for what reason?

I'd initially propose not including: the non-printing space character (b45[36]); the dollar sign (b45[37]), a possible variable wildcard and also a currency symbol that might get translated; the percent sign (b45[38]), the URL escape character and a database wildcard; and the asterisk (star) (b45[39]), the typical wildcard. Thus the sequence string would be:

b41 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ+-./:";
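
A rough sketch of how that string falls out of the shared table, plus a round trip for one 4-byte block. The function names, the most-significant-digit-first order, and the fixed 4-byte/6-character block size are all assumptions for illustration, not part of any spec.

```python
B45 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"
# Drop indices 36..39 (space, $, %, *) as proposed above.
B41 = B45[:36] + B45[40:]


def encode_block(block: bytes, alphabet: str = B41) -> str:
    """Encode exactly 4 bytes as 6 characters of the given alphabet."""
    if len(block) != 4:
        raise ValueError("expected exactly 4 bytes")
    value = int.from_bytes(block, "big")
    digits = []
    for _ in range(6):
        value, digit = divmod(value, len(alphabet))
        digits.append(alphabet[digit])
    return "".join(reversed(digits))


def decode_block(text: str, alphabet: str = B41) -> bytes:
    """Invert encode_block; raises on unknown characters or out-of-range values."""
    if len(text) != 6:
        raise ValueError("expected exactly 6 characters")
    value = 0
    for ch in text:
        value = value * len(alphabet) + alphabet.index(ch)
    return value.to_bytes(4, "big")


block = b"\xde\xad\xbe\xef"
assert decode_block(encode_block(block)) == block
print(B41)
```

Because the four dropped characters are contiguous in the table, the base41 table is just the base45 table with one slice removed, which keeps the cost of no longer sharing the lookup table fairly small.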