Hacker News new | ask | show | jobs
by nick238 810 days ago
Seems like the lede was buried in the article; I know a bit about QR codes: there's different modes for alphanum, binary, kanji, etc, and error correcting capacity...but being able to switch character sets in the middle was new to me.
1 comments

I am not entirely sure why you would want to switch encodings for URLs, personally. If you use alphanumeric encoding and a URL in Base36, you are pretty much information-theoretically optimal.
> you are pretty much information-theoretically optimal

base36 with alphanumeric mode encoding has around 6.38% overhead compared to base10's 0.34% overhead in numeric mode. So numeric mode gets you closer to optimal.

The issue is that QR's alphanumeric segments are uppercase only, and while browsers will automatically lowercase the protocol and domain name, you'll have to either have all your paths be uppercase or automatically lowercase paths. On top of that when someone scans the code it will likely be presented with an uppercase URL (if it doesn't automatically open in a browser) and that should alert anyone that doesn't already know that uppercase domains are equivalent to lowercase domains.

Ideally QR codes would have had a segment to encode URIs more efficiently (73-82 characters depending on how the implementation decided to handle the "unreserved marks"), but that ship has long sailed.

Many QR code readers will auto-lowercase URLs that are encoded in alphanumeric encoding. The rest will recognize uppercase URLs just fine. Alphanumeric encoding was basically made for URLs.
The QR alphanumeric input encoding does not include basic URL query string characters like '?' '&' '='
I've been putting URL in QR for like a decade, mixed case and query string included. How has it never been an issue?
Because you used bytes mode, not alphanumeric mode