Hacker News new | ask | show | jobs
by userbinator 1668 days ago
One of the mantras I used when I was a programming instructor was "How can you know what to tell the computer to do, if you can't do it yourself?" This is a great example of that.

Don’t forget that the huffman codes are packed LSB to MSB, but are to be interpreted as an integer in big endian format. Why this insanity?

Because that's what Phil Katz did. I vaguely remember reading about that years ago, and believe it had to do with how he implemented it; not as canonical table lookup, but as explicit tree-walking, since the former hadn't really been invented yet and the latter would fit within the memory constraints of a 16-bit system (DOS on a PC).

1 comments

Looks like the bytes shall not be reversed beforehand. Bits are easier to extract at the lower end. For example, let's take the first byte as an example, 01001011. The first field is one bit wide, indicating the final record of the stream, 01001011 & 1 = '1'. Remove the extracted bit by shifting, 0100101 remains. The next field is two bits wide, so extract those: 0100101 & 11 = '01' and remove the bits giving 01001. And so on...
Not so much. There are several degrees of freedom in this design, see ryg's Reading bits in far too many ways [1] for the introduction.

[1] https://fgiesen.wordpress.com/2018/02/19/reading-bits-in-far...