Hacker News
by AnotherGoodName 1897 days ago
One thing I'd encourage looking at is dynamic Markov coding (DMC). It's easy enough to implement and gets text down to about 1/5th of its original size. That's still short of the ~6x ratio of the current best (paq8), but it's close. There's no fixed dictionary involved. As you encode or decode, you update the probabilities and build the dictionary on the fly, and encode with arithmetic coding.
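To make the "update the probabilities as you go, then arithmetic-code" idea concrete, here is a minimal sketch in Python. Note that it is not DMC proper: DMC grows a bit-level state machine by cloning states, while this sketch swaps in a simpler fixed order-1 bitwise context model. The coder is a carryless 32-bit binary arithmetic coder in the style of the small PAQ-family demos. The names `Model`, `compress`, and `decompress`, the 12-bit probability scale, and the adaptation shift of 4 are all assumptions made for illustration.

```python
from collections import defaultdict

MASK = 0xFFFFFFFF  # the coder works on a 32-bit window

class Model:
    """Adaptive order-1 bitwise model (illustrative, not DMC).
    Predicts P(next bit = 1) from the previous byte plus the bits of the
    current byte seen so far, and adjusts that estimate after every bit."""
    def __init__(self):
        self.p = defaultdict(lambda: 2048)  # 12-bit probabilities, start at 1/2
        self.prev = 0      # previous complete byte
        self.partial = 1   # bits of the current byte, with a leading 1 marker

    def predict(self):
        return self.p[(self.prev, self.partial)]

    def update(self, bit):
        key = (self.prev, self.partial)
        p = self.p[key]
        # Nudge the estimate toward the bit that actually occurred.
        self.p[key] = p + ((4096 - p) >> 4) if bit else p - (p >> 4)
        self.partial = (self.partial << 1) | bit
        if self.partial >= 256:            # a whole byte has been seen
            self.prev = self.partial & 0xFF
            self.partial = 1

def compress(data: bytes) -> bytes:
    model, out = Model(), bytearray()
    x1, x2 = 0, MASK                       # current interval [x1, x2]
    for byte in data:
        for i in range(7, -1, -1):
            bit = (byte >> i) & 1
            xmid = x1 + ((x2 - x1) >> 12) * model.predict()
            if bit:
                x2 = xmid
            else:
                x1 = xmid + 1
            model.update(bit)
            while ((x1 ^ x2) & 0xFF000000) == 0:   # top byte settled: emit it
                out.append(x2 >> 24)
                x1 = (x1 << 8) & MASK
                x2 = ((x2 << 8) & MASK) | 0xFF
    out += x1.to_bytes(4, "big")           # flush enough bytes to pin the interval
    return bytes(out)

def decompress(blob: bytes, n: int) -> bytes:
    """Decode n bytes; the original length is passed out of band."""
    model, out = Model(), bytearray()
    x1, x2 = 0, MASK
    x, pos = int.from_bytes(blob[:4], "big"), 4
    for _ in range(n):
        byte = 0
        for _ in range(8):
            xmid = x1 + ((x2 - x1) >> 12) * model.predict()
            bit = 1 if x <= xmid else 0
            if bit:
                x2 = xmid
            else:
                x1 = xmid + 1
            model.update(bit)              # same update as the encoder
            byte = (byte << 1) | bit
            while ((x1 ^ x2) & 0xFF000000) == 0:
                x1 = (x1 << 8) & MASK
                x2 = ((x2 << 8) & MASK) | 0xFF
                nxt = blob[pos] if pos < len(blob) else 0
                x = ((x << 8) & MASK) | nxt
                pos += 1
        out.append(byte)
    return bytes(out)
```

Calling `compress(text)` and then `decompress(packed, len(text))` should return the original bytes. The decoder never receives a table of probabilities: it rebuilds the same model by applying the same updates to the bits it has already decoded, which is why nothing dictionary-like has to be stored alongside the data.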
2 comments

Prediction by partial match (PPM) is also very good. The "d" variant (PPMd), which ships with 7-Zip, gives very good compression.

The longer the text and the more "common" its content, the better it compresses.

Yes, I think Markov coding, and Markov chains in general, are good things to learn. But my point in this article is that there should be a stable dictionary, because for big-data purposes, especially in databases, it is hard to recalculate the probabilities every time.
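As a rough illustration of the "stable dictionary" side of this trade-off, here is a minimal sketch using the preset-dictionary support in Python's standard `zlib` module. This is not the article's method, just a stand-in for the general idea: the dictionary is fixed once, shared by writer and reader, and each small record is compressed independently without re-learning anything. `SHARED_DICT`, `pack`, `unpack`, and the record layout are hypothetical.

```python
import zlib

# Hypothetical shared dictionary: byte strings expected to recur across records.
# It must be identical on the compressing and decompressing side.
SHARED_DICT = b'{"user_id": , "status": "active", "country": "US", "plan": "free"}'

def pack(record: bytes) -> bytes:
    """Compress one record on its own, priming DEFLATE with the fixed dictionary."""
    c = zlib.compressobj(level=9, zdict=SHARED_DICT)
    return c.compress(record) + c.flush()

def unpack(blob: bytes) -> bytes:
    """Decompress one record using the same fixed dictionary."""
    d = zlib.decompressobj(zdict=SHARED_DICT)
    return d.decompress(blob) + d.flush()
```

Because every record is decoded on its own against the same dictionary, a database can fetch and decompress a single row without replaying earlier rows, which is the property an adaptive model gives up.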