by Karellen 813 days ago
I'm not that familiar with QR codes. Anyone know how base16/hexadecimal encoding with 0-9A-F fares in comparison? It seems like an obvious encoding to test, especially for simplicity of implementation compared to base64 and base10, and an odd one to miss for comparison?
4 comments

The QR standard does not have a specific encoding mode [0] for hexadecimal, so it would have to use alphanumeric mode. Since you'd only be using 16 of the 45 possible characters, it would be much less efficient.

[0] https://en.wikipedia.org/wiki/QR_code#Information_capacity

Hex is worse; see [1] for the UUID case.

[1] https://news.ycombinator.com/item?id=39094251

I'm not confident of the math there.

https://i.imgur.com/cAVbqka.png

Because of quirks in the encoding, decimal is more efficient in some edge cases, but overall alphanumeric is better in QR codes.

Ah, I was assuming numeric data rendered as hex, like UUIDs. Decimal works wonders for those, because the numeric mode of QR codes is the most efficient.
Ah, good point: hex is worth comparing against, but it fares far worse.

Abstractly, it requires approximately log(45)/log(16) output bits per input bit, an overhead of 37%.

Making this more concrete: each input byte is encoded as two hex digits, and two hex digits have to be encoded as two Alphanumeric characters. It thus takes 11 bits in the QR code bit stream to store 8 bits of input.
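
To sanity-check that arithmetic, here is a minimal Python sketch. It assumes the standard alphanumeric-mode packing (11 bits per pair of characters, 6 bits for a leftover single character) and counts only data bits, ignoring the mode indicator, character count field and error correction. The function names are made up for illustration.

```python
def alphanumeric_data_bits(n_chars: int) -> int:
    """Data bits used by n_chars characters in QR alphanumeric mode.

    Assumes 11 bits per pair of characters and 6 bits for an odd one out.
    """
    pairs, leftover = divmod(n_chars, 2)
    return 11 * pairs + 6 * leftover


def hex_bits_for_bytes(n_bytes: int) -> int:
    """Data bits needed when n_bytes of input are written as hex digits."""
    # Each input byte becomes two hex digits, i.e. one alphanumeric pair.
    return alphanumeric_data_bits(2 * n_bytes)


if __name__ == "__main__":
    for n in (1, 16, 32):
        bits = hex_bits_for_bytes(n)
        print(f"{n} bytes -> {bits} bits, ratio {bits / (8 * n):.3f}")
```

For a 16-byte UUID this gives 176 data bits for 128 bits of input, the same 11/8 ratio as above.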

(I've added an analysis of this and other bases to the article: https://huonw.github.io/blog/2024/03/qr-base10-base64/#fn:ot...)
RFC 3986 says that * is a sub-delim. It cannot be assumed to be URI safe.

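Among the characters available in QR alphanumeric mode, a base 38 alphabet (0-9, A-Z, - and .) is the largest possible alphabet made only of URI unreserved characters.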
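
The count of 38 can be checked by intersecting the QR alphanumeric character set with the RFC 3986 unreserved set. This is only a sketch: the two sets are typed in by hand from the respective specs, and the variable names are illustrative.

```python
import string

# Characters available in QR alphanumeric mode (45 in total):
# digits, uppercase letters, space and $ % * + - . / :
QR_ALPHANUMERIC = set(string.digits + string.ascii_uppercase + " $%*+-./:")

# RFC 3986 "unreserved" characters: ALPHA / DIGIT / "-" / "." / "_" / "~"
URI_UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

# Characters that are both QR-alphanumeric and URI-unreserved.
safe = sorted(QR_ALPHANUMERIC & URI_UNRESERVED)

print(len(QR_ALPHANUMERIC), len(safe))
print("".join(safe))
```

The intersection drops space and $ % * + / :, leaving the digits, the uppercase letters, - and .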

The most compact QR encoding capable of representing hex symbols is alphanumeric mode, which requires 5.5 bits per character. That means the output will be 5.5/4 = 1.375 times longer than the encoded binary data, a 37.5% overhead. That's even worse than the 8/6 ≈ 1.33 you get from base64 encoding on top of byte mode.
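
A rough way to put these ratios side by side is to divide the QR bits spent per symbol by the bits of input each symbol carries. The sketch below does that for hex in alphanumeric mode and base64 in byte mode, and also for decimal in numeric mode, assuming numeric mode packs 3 digits into 10 bits. It ignores mode headers and padding, and the `overhead` helper is hypothetical.

```python
import math


def overhead(qr_bits_per_symbol: float, alphabet_size: int) -> float:
    """QR data bits spent per bit of input for a given text encoding."""
    return qr_bits_per_symbol / math.log2(alphabet_size)


# Hex digits in alphanumeric mode: 11 bits per 2 characters, 4 input bits each.
HEX_ALPHANUMERIC = overhead(11 / 2, 16)

# Base64 characters in byte mode: 8 bits per character, 6 input bits each.
BASE64_BYTE = overhead(8, 64)

# Decimal digits in numeric mode: 10 bits per 3 digits (assumed packing).
DECIMAL_NUMERIC = overhead(10 / 3, 10)

for name, value in [
    ("hex / alphanumeric", HEX_ALPHANUMERIC),
    ("base64 / byte", BASE64_BYTE),
    ("decimal / numeric", DECIMAL_NUMERIC),
]:
    print(f"{name:20s} {value:.3f}")
```

Under these assumptions hex comes out worst of the three and decimal in numeric mode is close to 1, which matches the comments above.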