Hacker News new | ask | show | jobs
by btilly 5422 days ago
It does help you. You're following a dynamic programming solution. Suppose that the maximum dictionary word is of length k. Then for each position in the string you have a maximum of k previous positions that you're tracking for, "We could start a next word here."

Once you've scanned the string, only then do you discover whether the whole string is segmentable.

Put another way, while you're processing you don't immediately know whether or not the whole string is segmentable. But having a trie can let you discard possibilities early. Discarding work early means doing less work means being more efficient.

1 comments

It's true that the trie allows you to prune the search tree, but I don't think that gets you to O(N). The maximum-dictionary-word-length check does get you to O(N), though, or rather O(kN) if you consider the dictionary as part of the input.
The trie check contains the maximum-dictionary-word-length check as an obvious special case and therefore its worst case is the same order of magnitude efficiency as the other check.

You do have the cost of descending a level in the trie. But that can be made constant (with a jump table) or else the log of the size of the alphabet (which is a constant for all intents and purposes).

I suppose that depends on what you store in your trie. If every node contains a field for the length of the longest path below it, then yes. But that's not a normal thing to store in a trie, and it wasn't at all obvious to me that that was what you meant.
Not true.

If you keep on following the trie, you'll fall out of the trie at the point that there are no words that have that sequence at the start. Which will happen by the time you exceed the longest word in the dictionary, but will probably happen substantially before that.

Am curious to see having the dictionary in a trie allows you to build an implementation that is O(n). Will add it to the blog post if it works.