by 6510 1003 days ago
How much better would that get if you appended all but one of the equal-size documents? (Or other combinations, like two of the top results after first using a single one.)

Better, if the compressor can use all that extra context. Gzip and most traditional general-purpose compressors can't: gzip's DEFLATE, for example, only looks back 32 KiB.
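A rough way to see this for yourself, as a sketch rather than a benchmark: measure how many extra compressed bytes a document costs when an identical copy sits earlier in the input. The helper `extra_cost`, the synthetic random "documents", and the sizes are all made up for illustration. zlib stands in for gzip (same DEFLATE algorithm, 32 KiB window), and LZMA stands in for a compressor with a much larger dictionary.

```python
import lzma
import random
import zlib


def extra_cost(compress, context, doc):
    """Compressed bytes added by appending doc to context (hypothetical helper)."""
    return len(compress(context + doc)) - len(compress(context))


rng = random.Random(0)


def noise(n):
    """n pseudo-random bytes: incompressible unless an earlier copy is reachable."""
    return bytes(rng.getrandbits(8) for _ in range(n))


def zlib9(data):
    """DEFLATE at maximum effort; the look-back window is still only 32 KiB."""
    return zlib.compress(data, 9)


doc = noise(20_000)      # the document we want to score
filler = noise(40_000)   # unrelated material that pushes the earlier copy far away

# Earlier copy is 20,000 bytes back: inside DEFLATE's window.
near_zlib = extra_cost(zlib9, doc, doc)
# Earlier copy is 60,000 bytes back: outside DEFLATE's window.
far_zlib = extra_cost(zlib9, doc + filler, doc)
# Same far layout, but LZMA's default dictionary is several MiB.
far_lzma = extra_cost(lzma.compress, doc + filler, doc)

print("zlib, copy within window: ", near_zlib)
print("zlib, copy outside window:", far_zlib)
print("lzma, copy outside window:", far_lzma)
```

With these sizes, the zlib cost should collapse to a small fraction of the document only when the earlier copy is within 32 KiB, while LZMA should still find the far copy. Real text is messier than random bytes, so treat the numbers as a demonstration of the window limit, not of classification quality.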

It's hard to use distant context effectively. Even general-purpose compression methods that theoretically can often deliberately reset part of their context: assuming a big file follows the same distribution throughout as it does at the beginning often hurts compression more than just starting over periodically.
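A toy illustration of why resetting can win, assuming a deliberately simple model that is not how any particular compressor is implemented: an adaptive order-0 byte model with add-one counts, scored by its ideal code length in bits. The function `adaptive_cost_bits` and the two-part synthetic input are hypothetical. The first half draws from `a`/`b`, the second from `c`/`d`, so statistics learned on the first half are pure dead weight for the second.

```python
import math
import random


def adaptive_cost_bits(data, reset_at=None):
    """Ideal code length in bits under an adaptive order-0 model.

    Every byte value starts with a count of 1. If reset_at is given,
    the counts are thrown away just before coding that position.
    """
    counts = [1] * 256
    total = 256
    bits = 0.0
    for i, byte in enumerate(data):
        if i == reset_at:
            counts = [1] * 256
            total = 256
        bits -= math.log2(counts[byte] / total)  # cost of coding this byte
        counts[byte] += 1                        # then learn from it
        total += 1
    return bits


rng = random.Random(0)
first = bytes(rng.choice(b"ab") for _ in range(5000))   # one distribution
second = bytes(rng.choice(b"cd") for _ in range(5000))  # a completely different one
data = first + second

no_reset = adaptive_cost_bits(data)
with_reset = adaptive_cost_bits(data, reset_at=len(first))

print("bits without reset:", round(no_reset))
print("bits with reset:   ", round(with_reset))
```

Here the reset is placed exactly at the change point, which a real compressor would not know in advance; periodic resets are a blunt approximation of that. When the later data does resemble the earlier data, the same reset throws away useful statistics, which is the trade-off.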