| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by SeveredCross 5669 days ago
	I found the two-queue technique to be much more intuitive than the binary heap technique, which definitely resulted in it being more fun to code. As an aside, the two-queue technique is also more performant, as you can build the tree in O (n) time instead of O (n log n) time.

1 comments

eru 5669 days ago

You still have to sort to build the queues. So you won't get below O(n log n) for the whole algorithm.

If you squint just about right, you can probably see the equivalence between sorting plus queues and the tree based techniques. Especially if you use a tree-structured sorting algorithm.

link

SeveredCross 5669 days ago

Indeed, though as the OP points out just below, if you have fixed frequencies, you don't have to sort.

That said, you generally have fixed frequencies with a predetermined set of constraints on the data, and you can determine the most common character frequency (or it's already known). I'm not sure how common this is in real-world applications of Huffman coding.

Seeing as you should only be building the tree once though, you're right--you'd be hard-pressed to tell the difference between the two approaches, barring implementation mistakes. Designing an implementation mistake that would cause noticeable differences is an exercise left to the reader.

link

pohl 5669 days ago

In my particular case, the character frequencies are (more or less) fixed and the sorting could be done ahead-of-time before feeding the leaf nodes into the first queue. So I really did get O(n) at instantiation time. However, my n is so small that it didn't matter much.

link

eru 5669 days ago

You can also get fix your whole encoding then.

link

pohl 5669 days ago

You can if it's absolutely fixed. But what about when it's only "more-or-less-fixed"? :-)

link

eru 5669 days ago

Then you need the sorting, again. You can't have your cake and eat it, too.

link

pohl 5669 days ago

Being able to relegate the cost of sorting to be a one-time cost on a server and allowing clients to build the tree they need for decoding in O(n) allows the client to [something about cake].

link

eru 5669 days ago

Oops, horrible typo there.

link