by Acorn 5506 days ago
If you don't mind using - and _ then the implementation is very simple. I posted an example in Python and CoffeeScript to SO recently for doing this exact thing.

http://stackoverflow.com/questions/5940416/compress-a-series...

  >>> bin_string = '000111000010101010101000101001110'
  >>> encode(bin_string)
  'hcQOPgd'


  >>> decode(encode(bin_string))
  '000111000010101010101000101001110'

FYI, it seems to be known as this: http://en.wikipedia.org/wiki/Base64#URL_applications
Hmm, looking at Python's base64.urlsafe_b64encode(s), padding is still used. So it sounds like it doesn't fully implement the "modified Base64 for URL" variant described there, which omits the padding.
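For anyone who wants to check the padding behaviour, here is a minimal sketch using only the standard library. The two input bytes are an arbitrary choice (an assumption for illustration) picked because they exercise both replacement characters, - and _, and leave a remainder that forces padding.

```python
import base64

# Two bytes is not a multiple of three, so the encoder has to pad.
raw = b"\xfb\xff"

standard = base64.b64encode(raw)          # uses + and /
padded = base64.urlsafe_b64encode(raw)    # uses - and _, but still pads
stripped = padded.rstrip(b"=")            # drop the padding by hand

print(standard)  # b'+/8='
print(padded)    # b'-_8='
print(stripped)  # b'-_8'
```

So the URL-safe alphabet is there, and the only manual step is stripping the trailing = characters.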

Base64 encoding also seems to result in quite long ASCII strings compared to what I threw together.

Or is there a way to use Base64 which would give results of a comparable length?

It's a lot longer because it doesn't know you have encoded your input data in, essentially, base-2 with one whole character per bit. If you convert it to base-256 (that is, an array of bytes instead of an array of bits, as you have now), it would produce about the same length. Yes, there is a Base64 implementation in pretty much any language, though they usually use + and / for the two extra characters.
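A rough sketch of that conversion, assuming the input is a string of '0' and '1' characters like the bin_string in the parent comment. The helper name bits_to_bytes is made up for this example, and zero-padding the last byte on the right is just one possible convention; it is not the method from the Stack Overflow answer.

```python
import base64

def bits_to_bytes(bits: str) -> bytes:
    """Pack a '0'/'1' string into bytes, zero-padding the final byte on the right."""
    if not bits:
        return b""
    padded = bits + "0" * (-len(bits) % 8)
    return int(padded, 2).to_bytes(len(padded) // 8, "big")

bin_string = "000111000010101010101000101001110"  # 33 bits, from the parent comment

# Encoding the text directly spends a whole byte on every bit.
naive = base64.urlsafe_b64encode(bin_string.encode("ascii"))

# Packing eight bits per byte first gives a much shorter result.
packed = bits_to_bytes(bin_string)
compact = base64.urlsafe_b64encode(packed).rstrip(b"=")

print(len(naive), len(packed), len(compact))  # 44 5 7
```

Note that the zero-padding loses the exact bit count, so a decoder would need the original length from somewhere else (a prefix, a fixed size, etc.) to get back exactly 33 bits.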

As Wikipedia says, padding can be added or removed as a matter of taste: "From a theoretical point of view the padding character is not needed, since the number of missing bytes can be calculated from the number of base64 digits."

http://en.wikipedia.org/wiki/Base64#Padding
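In practice, Python's decoder does want the padding back, so if you strip it you have to recompute it from the length before decoding. A small sketch; the function name b64url_decode_unpadded is hypothetical, not something from the standard library.

```python
import base64

def b64url_decode_unpadded(text: str) -> bytes:
    """Decode URL-safe Base64 whose trailing '=' padding was stripped."""
    # The number of missing '=' characters follows from the length modulo 4.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

print(b64url_decode_unpadded("aGVsbG8"))  # b'hello'
print(b64url_decode_unpadded("-_8"))      # b'\xfb\xff'
```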