Hacker News new | ask | show | jobs
by glinscott 2758 days ago
If you are interested in compression at all, be sure to take a trip through Charles Bloom's blog [1]. It's an incredible read, he covers everything from the basics all the way through state of the art algorithms.

A great example is this post [2], where he talks about how to correctly implement a Huffman encoder/decoder. It's a lot tricker than it is made to sound in most books. For example, most Huffman codes that are used in practice are length limited, to allow the decoder to use smaller lookup tables. There are a bunch of surprisingly interesting tricks to get that to work well from the encoding side (which symbols do you choose to be smaller than they would be otherwise?).

[1] http://cbloomrants.blogspot.com/ [2] http://cbloomrants.blogspot.com/2010/08/08-12-10-lost-huffma...

4 comments

I like Charles Bloom's blog, but I'm not sure if it's very approachable (after all, it's "rants" :-). If you want more diversity in the readings, ryg blog [1] makes good reading. Start with the most recent series on efficiently reading bits [2].

[1] https://fgiesen.wordpress.com/category/compression/

[2] https://fgiesen.wordpress.com/2018/02/19/reading-bits-in-far...

Zstandard uses both Huffman and ANS coding: https://en.wikipedia.org/wiki/Asymmetric_numeral_systems
Thank you for these pointers. Posts like this are why I come to HN :)
What I would really love to know is that how they get those results with their newest set of algorithm. They seem to beat every compression algorithm including zstd and lz4 in both compression and speed. And how we can all benefit from those improvements.