| HN Mirror



	by AnotherGoodName 1897 days ago
	One thing I'd encourage looking at is dynamic Markov coding. It's easy enough to implement and gets you to about 1/5th of the original size for text compression. That's still not at the ~6x ratio of the current best (paq8), but it's close. There's no prebuilt dictionary involved. As you encode or decode, you update the probabilities, build the dictionary on the fly, and encode with arithmetic coding.
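
To make the "update the probabilities as you go" idea concrete, here is a minimal sketch. It is not dynamic Markov coding itself (DMC works on bits and clones states); it is a simpler adaptive order-1 byte model, and instead of running a real arithmetic coder it just sums the ideal code length, -log2(p), that an arithmetic coder would approach. The function name, the starting context of 0, and the add-one smoothing are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def adaptive_code_length_bits(data: bytes) -> float:
    """Ideal coded size (in bits) of `data` under an adaptive order-1 byte model."""
    # counts[prev][sym]: how often byte `sym` has followed byte `prev` so far.
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    prev = 0  # assumed starting context, shared by encoder and decoder
    bits = 0.0
    for sym in data:
        # Add-one smoothed estimate using only counts seen *before* this byte,
        # so a decoder could rebuild exactly the same probability.
        p = (counts[prev][sym] + 1) / (totals[prev] + 256)
        bits += -math.log2(p)
        # Update the model only after the byte has been "coded".
        counts[prev][sym] += 1
        totals[prev] += 1
        prev = sym
    return bits
```

Calling `adaptive_code_length_bits(text) / 8` gives a rough byte count to compare against `len(text)`. No dictionary or table is transmitted, because the decoder makes the same updates in the same order.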

2 comments

woliveirajr 1897 days ago

Prediction by partial match (PPM) is also very good. The "d" version (PPMd), which ships with 7-Zip, gives very good compression.

The longer the text and the more "common" its content, the better the compression.


kronxe 1896 days ago

Yes, I think Markov coding and Markov chains are good things to learn. But my point in this article is that there should be a stable dictionary, because for big-data purposes, especially in databases, it is hard to recalculate the probabilities every time.
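
As a rough illustration of the stable-dictionary idea (not the scheme from the article), Python's standard `zlib` module accepts a preset dictionary via `zdict`. The dictionary is fixed up front and shared by writer and reader, so each small record is compressed independently with nothing to relearn. `SHARED_DICT`, the record layout, and the helper names below are hypothetical.

```python
import zlib

# Hypothetical shared dictionary: byte strings expected to recur across records.
SHARED_DICT = b'{"user_id": , "status": "active", "country": "US", "plan": "free"}'


def compress_record(record: bytes, zdict: bytes = SHARED_DICT) -> bytes:
    # Each record gets a fresh compressor primed with the same fixed dictionary.
    c = zlib.compressobj(level=9, zdict=zdict)
    return c.compress(record) + c.flush()


def decompress_record(blob: bytes, zdict: bytes = SHARED_DICT) -> bytes:
    # The reader must supply the identical dictionary to decode.
    d = zlib.decompressobj(zdict=zdict)
    return d.decompress(blob) + d.flush()
```

The trade-off is that the dictionary has to be versioned and stored alongside the data: changing it makes previously written records unreadable unless the old dictionary is kept.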
