Hacker News new | ask | show | jobs
by BlackAura 4869 days ago
Huffman coding alone wouldn't do a lot in that example. Huffman coding lets you build a prefix code to pack symbols into variable length sequences of bits, depending on the frequency of the symbol. A dictionary encoder, like LZ77, would replace repeated sequences by references, and that would make the two variable names equivalent. Deflate uses Huffman coding to pack the LZ77 compressed data using as few bits as possible, making it more efficient, but LZ77 is the part removing the redundancy.

I agree with the general sentiment though. Compression isn't magic, and understanding how it works helps you to work with compressed formats.

Writing your own implementation of some compression algorithms is a lot of fun to. I learned a lot from implementing Huffman coding and an LZ77 variant a few years ago.

1 comments

>Huffman coding alone wouldn't do a lot in that example. Huffman coding lets you build a prefix code to pack symbols into variable length sequences of bits, depending on the frequency of the symbol. A dictionary encoder, like LZ77, would replace repeated sequences by references, and that would make the two variable names equivalent. Deflate uses Huffman coding to pack the LZ77 compressed data using as few bits as possible, making it more efficient, but LZ77 is the part removing the redundancy.

That's how gzip works, but all you have to know that gzip WILL be packed via huffman coding in a compression algorithm, and you will be able to conclude you can have a far longer variable name length with no cost.

The actual algorithm itself is a far less common topic than huffman coding.