|
|
|
|
|
by dkbrk
1370 days ago
|
|
Honestly, the inverse Burrows-Wheeler transform seems like some sort of Voodoo black magic to me. It reminds be of the 100 prisoners problem [0]. Yes, I understand why it works. I can see how it works mathematically. But it still feels like it shouldn't work, that we're somehow getting something for free. [0]: https://en.wikipedia.org/wiki/100_prisoners_problem |
|
https://bugfix-66.com/7f0f425d3eee8def16e1102d054fd45394d027...
Wikipedia gives an impractical, inefficient account of the algorithm. It's almost comically bad.
But the BWT has an efficient inverse transform with very short, simple code, as seen above. This is the classic linked list algorithm.
The forward transform is also simple if done right. See the Go code here:
https://bugfix-66.com/72949624ebbb28b3d0ce5d700970a8857d354b...
Together, that's full code for industrial-strength forward and inverse BWT. All that's missing is some method of avoiding degeneracy in the forward transform string sort (e.g., noising long runs in the input or a suffix array).