| HN Mirror



	by Karellen 813 days ago
	I'm not that familiar with QR codes. Anyone know how base16/hexadecimal encoding with 0-9A-F fares in comparison? It seems like an obvious encoding to test, especially for simplicity of implementation compared to base64 and base10, and an odd one to miss for comparison?

4 comments

pimlottc 813 days ago

The QR standard does not have a specific encoding mode [0] for hexadecimal; it would have to use alphanumeric. Since you'd only be using 16 out of 45 possible characters, it would be much less efficient.

0: https://en.wikipedia.org/wiki/QR_code#Information_capacity


komlan 813 days ago

Hex is worse, see here [1] for UUIDs

[1] https://news.ycombinator.com/item?id=39094251


Zamicol 812 days ago

I'm not confident of the math there.

https://i.imgur.com/cAVbqka.png

Because of quirks, in edge cases decimal is more efficient, but overall alphanumeric is better in QR code.


komlan 812 days ago

Ah, I was assuming numeric data rendered as hex, like UUIDs. Decimal works wonders for those, because the numeric mode of QR codes is the most efficient.
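To put rough numbers on the UUID case, here is a minimal sketch. It assumes the standard QR packing rules (numeric mode: 3 digits per 10 bits, with 7 bits for a trailing pair and 4 bits for a trailing single digit; alphanumeric mode: 2 characters per 11 bits, with 6 bits for a trailing single character) and counts data bits only, ignoring the mode indicator and character-count header. The function names are made up for illustration.

```python
def numeric_mode_bits(num_digits: int) -> int:
    """Data bits needed for num_digits decimal digits in QR numeric mode."""
    full_groups, remainder = divmod(num_digits, 3)
    # 3 digits -> 10 bits; a trailing 2 digits -> 7 bits; a trailing 1 digit -> 4 bits
    return full_groups * 10 + {0: 0, 1: 4, 2: 7}[remainder]


def alphanumeric_mode_bits(num_chars: int) -> int:
    """Data bits needed for num_chars characters in QR alphanumeric mode."""
    pairs, remainder = divmod(num_chars, 2)
    # 2 characters -> 11 bits; a trailing single character -> 6 bits
    return pairs * 11 + remainder * 6


UUID_BITS = 128
# Worst case: the largest 128-bit value written in decimal (39 digits).
max_decimal_digits = len(str(2**UUID_BITS - 1))
# A UUID written as hex without dashes is 32 characters.
hex_digits = UUID_BITS // 4

print("decimal, numeric mode:  ", numeric_mode_bits(max_decimal_digits), "bits")
print("hex, alphanumeric mode: ", alphanumeric_mode_bits(hex_digits), "bits")
```

Under those assumptions a UUID costs 130 bits as decimal versus 176 bits as hex, against 128 bits of actual payload.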


dbaupp 813 days ago

Ah, good point, hex is worth comparing against, but it is far worse.

Abstractly, it requires approximately log(45)/log(16) ≈ 1.37 output bits per input bit, an overhead of about 37%.

Making this more concrete: each input byte is encoded as two hex digits, and two hex digits have to be encoded as two Alphanumeric characters. It thus takes 11 bits in the QR code bit stream to store 8 bits of input.


dbaupp 813 days ago

(I've added an analysis of this and other bases to the article: https://huonw.github.io/blog/2024/03/qr-base10-base64/#fn:ot...)


Zamicol 812 days ago

RFC 3986 says that * is a sub-delim. It cannot be assumed to be URI safe.

A base 38 alphabet is the maximal possible URI unreserved alphabet.


Karliss 813 days ago

The most compact QR encoding mode capable of representing hex symbols is alphanumeric mode, which requires 5.5 bits per character. That means the output will be 5.5/4 = 1.375 times longer than the encoded binary data, a 37.5% overhead. That's even worse than the 8/6 ≈ 1.33 you get from base64 encoding on top of byte mode.
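For anyone who wants to check the ratios, a quick sketch of the arithmetic. It assumes the usual figures (45-character alphanumeric set packed as 2 characters per 11 bits, byte mode at 8 bits per character) and ignores mode headers and padding; the variable names are just for illustration.

```python
import math

ALPHANUMERIC_CHARSET_SIZE = 45   # 0-9, A-Z, space, and $ % * + - . / :
BITS_PER_ALPHANUMERIC_PAIR = 11  # two alphanumeric characters are packed into 11 bits
BITS_PER_BYTE_MODE_CHAR = 8      # byte mode stores one 8-bit character at a time

# Hex: each character carries 4 bits of payload but costs 5.5 bits in the stream.
hex_expansion = (BITS_PER_ALPHANUMERIC_PAIR / 2) / 4

# Lower bound if a 45-symbol alphabet were packed perfectly
# instead of being rounded up to 11 bits per pair.
hex_expansion_ideal = math.log(ALPHANUMERIC_CHARSET_SIZE) / math.log(16)

# Base64 on top of byte mode: each character carries 6 bits and costs 8.
base64_expansion = BITS_PER_BYTE_MODE_CHAR / 6

print(f"hex in alphanumeric mode: {hex_expansion:.3f}x")
print(f"hex, ideal packing:       {hex_expansion_ideal:.3f}x")
print(f"base64 in byte mode:      {base64_expansion:.3f}x")
```

The gap between 1.375 and the ideal 1.373 is just the rounding from log2(45²) ≈ 10.98 bits up to 11 bits per pair.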
