|
|
|
|
|
by wpietri
1370 days ago
|
|
Having read through the (very good!) Wikipedia explanations a few times, I'm left with a different troubling intuition: if this works to improve compression, then I feel like there must be something wrong with the compression algorithms! |
|
The typical data compression algorithm starts with a combinatorial transformation of the data. Then it creates a statistical model of the transformed data and encodes it according to the model. There are two successful families of combinatorial transformations: Lempel–Ziv and Burrows–Wheeler. Neither of them is superior to the other, but the compression performance depends on the properties of the data.
Lempel–Ziv parsing replaces copies of repeated substrings with references to an earlier copy. It works best when there are long repeated substrings in the data. The modeling/encoding parts of an LZ compressor are usually pretty simple.
Burrows-Wheeler transform reorders the symbols according to the context they occur in. It's often but not always followed by run-length encoding. Because the symbols are reordered by context, BWT-based compressors work best when there are many copies of each repeated substring.
Unlike the LZ, the BWT does not compress the data on its own. It relies entirely on statistical compression. Because the symbols are ordered by context, the BWT is an order-n model for every n simultaneously. Hence you get something similar to PPM by simply partitioning the BWT into blocks and encoding each block independently with an order-0 encoder.