| HN Mirror

I haven't read this paper, and haven't worked with the compressed variety of suffix trees/arrays. That said, I'm confident it's possible to retrieve the original file from a normal suffix tree, although it would be pretty slow. I imagine it must be possible to retrieve the file from the compressed version as well.

If nothing else, I think it is possible to retrieve a complete file from any index that lets you search for substrings and get a full string and position in the file back. Just search for all length 1 substrings, get a map from position -> string, then reconstruct the file from that.

I doubt it's worth storing files this way, because turning the index back into the file sounds very slow. I'd rather just store the file and the index, the time v. space tradeoff seems like a good one if you really care about search performance. That said, I use The Silver Searcher, and it's fast enough with no index that I don't think any of this stuff is worth the effort for searching text files on a file system.