by paulddraper 911 days ago
> base36 textual contexts

Better IMO is base 32 with U (obscenity), 0/O (ambiguity), and I (ambiguity) removed.
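
As a rough sketch of that alphabet: starting from the 36 base36 characters and dropping U, 0, O, and I leaves exactly 32. The names `ALPHABET`, `encode`, and `decode` are made up for illustration, and reading "0/O removed" as dropping both the digit and the letter is an assumption.

```python
import string

# The 36-character base36 alphabet: digits, then uppercase letters.
BASE36 = string.digits + string.ascii_uppercase

# Assumption: drop U (obscenity) plus 0, O, and I (ambiguity), leaving 32 symbols.
ALPHABET = "".join(c for c in BASE36 if c not in "U0OI")


def encode(n: int) -> str:
    """Encode a non-negative integer, most significant symbol first."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, r = divmod(n, len(ALPHABET))
        out.append(ALPHABET[r])
    return "".join(reversed(out))


def decode(s: str) -> int:
    """Inverse of encode; raises ValueError on a character outside ALPHABET."""
    value = 0
    for ch in s:
        value = value * len(ALPHABET) + ALPHABET.index(ch)
    return value
```

Because the digit 0 is gone, the symbol for zero in this sketch is "1", which may itself surprise readers. That is one reason Crockford's variant keeps 0 and drops L instead.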

6 comments

Removing characters for obscenity is pointless (there are thousands of ways to evade this "filter"), English-centric, and honestly a weird idea.

I've always heard that the reason is another ambiguity (u/v), which makes more sense to me.

Base64/Base32/ASCII is English-centric.

Might be weird to you personally, but there are literally government agencies to prevent obscenities.

What makes the letter U obscene?
You can make the word fuck with it. That upsets children on the internet.
I doubt that upsets any children on the internet; more likely it upsets some adults on behalf of children on the internet.
If that's what you're trying to avoid, it will be a lot more effective to remove F.
Might as well go for Base27 then. Strip out all of the vowels and you can't accidentally make naughty words any more.
That's Crockford Base32, not RFC Base32

https://en.m.wikipedia.org/wiki/Base32

Crockford is a bit different, and normalizes I/1/O/0 on parsing.
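
A minimal sketch of that parse-time normalization, assuming integer values rather than byte streams. The function name `crockford_decode` is hypothetical. The alphabet is Crockford's (no I, L, O, or U), and besides I and O his spec also maps L to 1 and allows hyphens as separators.

```python
# Crockford's 32 symbols: digits plus letters, excluding I, L, O, and U.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

# Look-alike characters are folded into the digit they resemble.
_NORMALIZE = str.maketrans({"I": "1", "L": "1", "O": "0"})


def crockford_decode(text: str) -> int:
    """Decode a Crockford Base32 string to an integer, tolerating look-alikes."""
    # Case-insensitive; hyphens are only there for readability.
    cleaned = text.upper().replace("-", "").translate(_NORMALIZE)
    value = 0
    for ch in cleaned:
        idx = CROCKFORD.find(ch)
        if idx < 0:
            # U (and anything else outside the alphabet) is rejected, not guessed.
            raise ValueError(f"invalid Crockford Base32 character: {ch!r}")
        value = value * 32 + idx
    return value
```

With this, `crockford_decode("1O")` and `crockford_decode("10")` give the same value, so a human mistyping O for 0 still lands on the right record.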
Do we really expect humans to read baseX encodings directly often enough for ambiguity checks to be worthwhile?
Sometimes. Imagine if this is being used to generate something like a DOI or other catalog number for some data or physical artifact. As research scales up, the size of these identifiers also benefits from a more compact encoding.

These kinds of IDs might be printed in a research paper (perhaps in a figure caption or bibliography/reference entry). Then, someone might be reading this from a printed copy of the paper rather than a PDF with a link in it.

Or, researchers might be verbally referencing a particular item during some meeting. It might be recognizable among some peers actively working with the same artifacts, but might also need to be typed back into some search form to get back to online metadata etc.

Another place the same identifier might be is on a printed label for physical artifacts in an archive. Of course, you might also want something like a 2D barcode for scanning, but it is helpful to have something human readable.

Removing U just means your CD key begins with FCKGW
So.. crockford32 mentioned in the article?