Hacker News new | ask | show | jobs
by weitzj 2421 days ago
Also interesting to get into are suffix arrays which you could construct from a tree. Then there is a notion of generalized suffix arrays, where you mix multiple suffix arrays into a big suffix array. Then you can give fast answers like: what is the longest common prefix between the several tokens, whereas the tokens of your individual suffix arrays could be letters from an alphabet (so you construct a suffix array from words) or maybe words (you construct a suffix array from a sentence)
1 comments

And then you could get into compressed indexes such as the FM-index, which represent a text in a form that is both (typically) smaller than the original, and also supports fast substring queries.

Suffix trees are cool, but they’re just the entry point to a world of wonderful data structures and algorithms.

The Burrows-Wheeler transform was an interesting thing to see! When I used to teach at an immersive 3 month program for web dev we had a few weeks of playing with such arrays, trees, tries, dynamic programming, and text from Wikipedia.