Hacker News
by pchm 908 days ago
You make some good points. What I was trying to say is that even though there is the RFC, it's quite common to modify the alphabet or use other variants like Crockford's (mainly to avoid random profanity, e.g. in URL identifiers).

When you see a Base64 string, you can be pretty certain that it's the standard version. With Base32, it's not obvious which variant was used.
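
To make the ambiguity concrete, here is a minimal Python sketch that encodes the same bytes with three Base32 alphabets. The RFC 4648 output comes from the stdlib `base64.b32encode`. The base32hex and Crockford versions are produced by a plain alphabet translation, which is an assumption on my part: Crockford's spec describes encoding numbers, and real libraries may treat padding bits or check symbols differently. The name `b32_variants` is made up for illustration.

```python
import base64

RFC4648 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
BASE32HEX = "0123456789ABCDEFGHIJKLMNOPQRSTUV"
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"  # skips I, L, O, U


def b32_variants(data: bytes) -> dict:
    """Encode `data` with three Base32 alphabets (padding stripped)."""
    # Standard RFC 4648 encoding from the stdlib.
    standard = base64.b32encode(data).decode("ascii").rstrip("=")
    # Same 5-bit groups, different symbols: translate character by character.
    return {
        "rfc4648": standard,
        "base32hex": standard.translate(str.maketrans(RFC4648, BASE32HEX)),
        "crockford": standard.translate(str.maketrans(RFC4648, CROCKFORD)),
    }


for name, text in b32_variants(b"hello").items():
    print(f"{name:10} {text}")
```

All three results consist only of uppercase letters and digits, so nothing in the string itself says which alphabet produced it.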

Many languages don't provide a stdlib Base32 implementation (Ruby doesn't), but Base64 is pretty much always included. Maybe this influenced my perception of the lack of a universal standard.

Anyway, I should work on that section to communicate my point better.

2 comments

I believe the technical term is “Schelling point”: something that people can decide on without communication.

Base64 is very close to the Schelling point of Base62, i.e. [A-Za-z0-9], requiring only a couple more decisions to be made: which two extra characters to add.

Unfortunately, the original Base64 inexplicably got this wrong and chose + and / instead of the more sensible choice of - and _.
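
For reference, a small Python sketch of the two choices side by side, using the stdlib `base64` module. The input bytes are picked by me so that the 6-bit groups include the values 62 and 63, which is where the two alphabets differ.

```python
import base64

# Chosen so the 6-bit groups are 62, 63, 60: the first two hit the
# characters that differ between the two alphabets.
raw = b"\xfb\xff"

standard = base64.b64encode(raw).decode("ascii")
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")

print(standard)  # +/8=
print(url_safe)  # -_8=

# Both decode back to the same bytes with their matching decoder.
assert base64.b64decode(standard) == raw
assert base64.urlsafe_b64decode(url_safe) == raw
```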

In some cases (luck of the data, but often when encoding ASCII without padding) you won't see the non-alphanumeric characters (the ones in 62nd and 63rd place) in Base64 either. So you can't always tell the difference between Base64, Base64Url, Xxencode, or B64.

"Hello, world!" = `SGVsbG8sIHdvcmxkIQ` (base64, base64url), `BG4JgP4wg65RjQalY6E` (Xxencode), or `G4JgP4wg65RjQalY6E` (b64). A legitimate reason for choosing B64 over Base64 would be: it maintains ASCII sort-order.

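The strings above can be reproduced with a small unpadded encoder that takes the alphabet as a parameter. This is a sketch under a few assumptions: `encode` and the constant names are made up for illustration, the B64 alphabet is taken to be `./0-9A-Za-z` (the crypt-style ordering), and the Xxencode length prefix is simply the alphabet character at index `len(data)`, ignoring the line framing a real xxencode file has.

```python
import string

_ALNUM_ASCII_ORDER = string.digits + string.ascii_uppercase + string.ascii_lowercase

STD = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
XX = "+-" + _ALNUM_ASCII_ORDER
B64 = "./" + _ALNUM_ASCII_ORDER  # assumed crypt-style alphabet


def encode(data: bytes, alphabet: str) -> str:
    """Unpadded 6-bit encoding of `data` using a 64-character alphabet."""
    bits = "".join(f"{byte:08b}" for byte in data)
    bits += "0" * (-len(bits) % 6)  # zero-fill the last group
    return "".join(alphabet[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6))


msg = b"Hello, world!"
print(encode(msg, STD))                 # SGVsbG8sIHdvcmxkIQ
print(XX[len(msg)] + encode(msg, XX))   # BG4JgP4wg65RjQalY6E
print(encode(msg, B64))                 # G4JgP4wg65RjQalY6E

# Sort order: compare raw byte order with the order of the encoded strings.
samples = [b"\x00", b"\xff"]
print(sorted(samples, key=lambda d: encode(d, STD)) == sorted(samples))  # False
print(sorted(samples, key=lambda d: encode(d, B64)) == sorted(samples))  # True
```

The last two lines show the sort-order point: in ASCII, `/` (value 63 in the standard alphabet) comes before `A` (value 0), so encoded strings can sort differently from the raw bytes, while an alphabet listed in ASCII order keeps the two orders aligned.
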
Any language that has to deal with HTTP (or MIME) has to encode/decode Base64 in order to support some headers (e.g. Basic auth) and features (binary data from a form submission). There is no similar HTTP need for Base32, so perhaps it's less surprising that it's not in the standard library?
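
As an illustration of the Basic auth case, here is a minimal Python sketch of how the header value is built: the `user:password` pair is Base64-encoded with the standard alphabet and padding. The helper name `basic_auth_header` is hypothetical, and the credentials are the usual example pair from the HTTP Basic auth spec.

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Build the value of an HTTP `Authorization` header for Basic auth."""
    credentials = f"{user}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Basic {token}"


print(basic_auth_header("Aladdin", "open sesame"))
# Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```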