Hacker News new | ask | show | jobs
by j-pb 2422 days ago
A suffix tree contains a compact trie like graph representation of all the suffixes of a long string. So if you want to know if it contains a substring, you can start at the root and just search like in a prefix tree. Since for the queried word to be an infix it must be a prefix of a suffix, you will find the word in O(|query|) time.

So yeah its a prefix tree, but a prefix tree of all the suffixes.

And now to answer your question. If you naively put every suffix into a regular prefix tree, you would get the same lookup performance. BUT, since the suffixes will share a lot of suffixes, which the prefix tree can't compress you have a lot more memory consumption.

The suffix tree avoids this by essentially performing compression on the suffix as well, so that you simply have pointers to shared suffixes of suffixes, which point to the root.

In the end it's like a diamond shape I guess, trie in the front, compression in the back.

1 comments

Why can't one construct a prefix tree of all prefixes, then start process from the right to left (assume that it is an offline algorithm)?
Because you can walk search trees from their root but not from their leafs.