by ChuckMcM 3492 days ago
I was going to bring up base 85 as well; it's a better choice for a variety of reasons. A long time ago I wrote a base encoder class in Java[1], mostly so that we could write a netnews reader in Java but also because I felt UUEncoding was not robust. The challenges of using unprintable characters are a lot more of a headache than anyone pays attention to initially. Lots of systems (and I mean quite a few here) consider unprintable characters "safe" to re-purpose for random uses. One display vendor, for example, had them changing the color of subsequent characters on the display.

Stick with the characters that nearly everyone assumes could legitimately come up in a document, and you reduce your chances of running afoul of some "creative genius" who decided, "Hey, it's unprintable so no one will try to print it, but when I do print it I want this thing to happen..."

[1] http://grepcode.com/file/repository.grepcode.com/java/root/j...

1 comments

I wrote a base 92 encoder for the JavaScript game I'm working on:

http://www.emergencevector.com/

It's pretty easy to write the decoder for a single 0-91 digit in JavaScript:

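    // Decode one base 92 character into its 0-91 value (the function name is illustrative).
    function decodeDigit(ch) {
        if (ch === "!") {
            return 57; // "!" is special-cased to the value 57
        } else {
            return ch.charCodeAt(0) - 35; // "#" (char code 35) maps to 0
        }
    }

It doesn't give you that much usable compactness over base 64, though you can easily encode a 360-degree angle with two bits of precision lost. Also, 5 base 92 characters can fully encode 32 bits of binary data. (Of course, since base 85 can already do it in 5 characters.)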
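To make the 5-character claim concrete, here is a rough sketch of a 32-bit round trip. It is not the game's actual code. The alphabet is an assumption inferred from the decode snippet: char codes 35 ("#") through 126 ("~") map to 0-91, and since 57 + 35 = 92 is the backslash, I'm guessing "!" stands in for it. The names `encodeDigit`, `encodeUint32` and `decodeUint32` are made up for illustration. The arithmetic behind the claim is that 85^5 = 4,437,053,125 already exceeds 2^32 = 4,294,967,296, so any base of 85 or more fits 32 bits in 5 digits.

```javascript
// Assumed alphabet: char codes 35..126 map to digits 0..91,
// with "!" used in place of the backslash (digit 57).
const BASE = 92;

function encodeDigit(n) {
  return n === 57 ? "!" : String.fromCharCode(n + 35);
}

function decodeDigit(ch) {
  return ch === "!" ? 57 : ch.charCodeAt(0) - 35;
}

// Encode an unsigned 32-bit integer as exactly 5 digits, most significant first.
function encodeUint32(value) {
  let n = value >>> 0; // coerce to an unsigned 32-bit integer
  let out = "";
  for (let i = 0; i < 5; i++) {
    out = encodeDigit(n % BASE) + out;
    n = Math.floor(n / BASE);
  }
  return out;
}

// Decode a string of base 92 digits back into a number.
function decodeUint32(text) {
  let n = 0;
  for (const ch of text) {
    n = n * BASE + decodeDigit(ch);
  }
  return n;
}
```

Plain number arithmetic is enough here because 92^5 is well below 2^53, so nothing overflows a JavaScript double; the bitwise operators are avoided in the loop since they would truncate to signed 32 bits.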

I'm probably going to go to typed arrays of 32-bit values. Currently, I can encode an entire ship's data in 18 bytes, of which 4 characters are a hash id.