


	by csjh 645 days ago
	I think the most surprising part here is that gzipping the base64'd compressed data almost entirely removes the base64 overhead.


zamadatix 645 days ago

It feels somewhat intuitive since (as the article notes) the Huffman encoding stage effectively "reverses" the original base64 overhead, where an 8-bit symbol (256 choices) is spent on only 6 bits (64 choices) of actual data. A useful compression algorithm which _didn't_ do this sort of thing would be very surprising, as it would mean it doesn't notice simple patterns in the data but somehow compresses things anyway.
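A rough way to check this yourself, as a sketch rather than the article's own benchmark: the helper name `base64_overhead_demo` is made up for illustration, and pseudo-random bytes are assumed to be a fair stand-in for already-compressed data. It base64-encodes the bytes, gzips the result, and compares the sizes.

```python
import base64
import gzip
import random


def base64_overhead_demo(n_bytes=100_000, seed=0):
    """Return (raw_len, b64_len, gzipped_len) for pseudo-random input.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # Assumption: random bytes approximate incompressible (already-compressed) data.
    raw = bytes(rng.getrandbits(8) for _ in range(n_bytes))
    b64 = base64.b64encode(raw)    # 4 output characters per 3 input bytes
    gzipped = gzip.compress(b64)   # DEFLATE: LZ77 matching plus Huffman coding
    return len(raw), len(b64), len(gzipped)


if __name__ == "__main__":
    raw_len, b64_len, gz_len = base64_overhead_demo()
    print(f"base64 / raw:          {b64_len / raw_len:.3f}")
    print(f"gzip(base64) / raw:    {gz_len / raw_len:.3f}")
```

With only 64 roughly equally likely symbols in the input, a Huffman code needs about 6 bits per character instead of 8, which is where most of the 4/3 expansion should go away.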


ranger_danger 645 days ago

how does it affect error correction though?


zamadatix 645 days ago

Neither base64 nor the gzipped version has error correction as implemented. The extra overhead bits in base64 come from selecting only a subset of printable characters, not from adding redundancy to the useful bits.
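To make that concrete, here is a small sketch (the helper names are invented for illustration). Swapping one base64 character for another valid one decodes without complaint to different bytes, so nothing is detected, let alone corrected. The gzip container does carry a CRC-32 in its trailer, but that only lets a decoder notice damage, not repair it.

```python
import base64
import gzip


def flip_first_base64_char(encoded: bytes) -> bytes:
    """Swap the first character for a different, still valid, base64 character."""
    replacement = b"B" if encoded[:1] != b"B" else b"C"
    return replacement + encoded[1:]


def gzip_detects_bad_crc(payload: bytes) -> bool:
    """Corrupt the stored CRC-32 and report whether decompression fails."""
    blob = bytearray(gzip.compress(payload))
    # The gzip trailer is the CRC-32 followed by the input size, 4 bytes each.
    blob[-8] ^= 0x01
    try:
        gzip.decompress(bytes(blob))
    except OSError:  # gzip.BadGzipFile is a subclass of OSError
        return True
    return False


data = b"hello world!"
encoded = base64.b64encode(data)
# Decoding succeeds, but the result silently differs from the original.
damaged = base64.b64decode(flip_first_base64_char(encoded))

if __name__ == "__main__":
    print("base64 decoded without error:", damaged != data)
    print("gzip noticed the bad CRC:", gzip_detects_bad_crc(data))
```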
