| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by btilly 5469 days ago

It does help you. You're following a dynamic programming solution. Suppose that the maximum dictionary word is of length k. Then for each position in the string you have a maximum of k previous positions that you're tracking for, "We could start a next word here."

Once you've scanned the string, only then do you discover whether the whole string is segmentable.

Put another way, while you're processing you don't immediately know whether or not the whole string is segmentable. But having a trie can let you discard possibilities early. Discarding work early means doing less work means being more efficient.

1 comments

kragen 5469 days ago

It's true that the trie allows you to prune the search tree, but I don't think that gets you to O(N). The maximum-dictionary-word-length check does get you to O(N), though, or rather O(kN) if you consider the dictionary as part of the input.

link

btilly 5469 days ago

The trie check contains the maximum-dictionary-word-length check as an obvious special case and therefore its worst case is the same order of magnitude efficiency as the other check.

You do have the cost of descending a level in the trie. But that can be made constant (with a jump table) or else the log of the size of the alphabet (which is a constant for all intents and purposes).

link

kragen 5469 days ago

I suppose that depends on what you store in your trie. If every node contains a field for the length of the longest path below it, then yes. But that's not a normal thing to store in a trie, and it wasn't at all obvious to me that that was what you meant.

link

btilly 5469 days ago

Not true.

If you keep on following the trie, you'll fall out of the trie at the point that there are no words that have that sequence at the start. Which will happen by the time you exceed the longest word in the dictionary, but will probably happen substantially before that.

link

dtunkelang 5467 days ago

Am curious to see having the dictionary in a trie allows you to build an implementation that is O(n). Will add it to the blog post if it works.

link