Hacker News new | ask | show | jobs
by zamadatix 645 days ago
It feels somewhat intuitive since (as the article notes) the Huffman encoding stage effectively "reverses" the original base64 overhead issue that an 8 bit (256 choices) index is used for 6 bits (64 choices) of actual characters. A useful compression algorithm which _didn't_ do this sort of thing would be very surprising as it would mean it doesn't notice simple patterns in the data but somehow compresses things anyways.
1 comments

how does it affect error correction though?
Neither base64 nor the gzipped version have error correction as implemented. The extra overhead bits in base64 come from selecting only a subset of printable characters, not by adding redundancy to the useful bits.