Hacker News new | ask | show | jobs
by gliptic 719 days ago
I mean without storing the freq table, which the last example you linked doesn't. Having to store a frequency table limits the encoder to do no better than the 0-th order entropy. Arithmetic coding can do arbitrarily complex prediction per symbol, see e.g. http://mattmahoney.net/dc/text.html#1072 which reaches the world record by training a neural network to predict the next symbol while it's encoding a file.

Not to say a 0-th order entropy coder could not be useful for certain applications if it was e.g. faster than, say, a range coder while maintaining similar ratios. But I doubt that is possible with this method.

1 comments

I've added basic bit packing to the frequencies table, so long as the dictionary doesn't dominate like in a pangram it's slightly smaller than two arithmetic (static and adaptive) examples I could find online. I understand this isn't the best that arithmetic can do, next I'll look into that.

Never was trying to compare to NNs or PPM (or even simple LZ), since all this algorithm does is look at the frequency counts.

Agree it's not very practical and very expensive calculations, just fun to work on.