Hacker News new | ask | show | jobs
by Radim 5605 days ago
I guess it depends on what type of queries you expect; if you want to find the same (fixed) substring across a body of (dynamic) texts, the O(n) cost of preprocessing (suffix trees/arrays) is terrible.

If, on the other hand, you have a fixed "corpus" and a dynamic query, O(n) search time (this algo, purportedly) is terrible.

1 comments

Hmm...there's no mention of it being suited for a dynamic or "online" setting as another commenter notes (not sure why my other comment was downvoted for that). I don't know what to make of this from the description:

"This algorithm especially well suited for long S, multi-substrings, small alphabet, regex/parsing and search for common substrings."

Long source text: the O(n times m) worst-case time per search kills it. Even for a single search, O(n times m) worst case here versus O(n + m) for suffix trees.

multi-substrings: suffix trees do each substring of length m in O(m), but are not compared.

search for common substrings: again, suffix trees would be more appropriate.