Humbling way to start the work week. I could produce the fizzbuzz solution and in my sharper days the recursive backtracing one, but definitely no further.
I wouldn't feel bad. The advanced answers to this question require spending a lot of time understanding string processing. It's like having a CSS question that can be implemented multiple ways: a simple, obvious, slow way and a complicated, "deep knowledge required", fast way. If you have lots of experience with CSS, you might get the fast way, but it doesn't really say how good a programmer you are.
(Yes, not a perfect analogy, but it hopefully gives the idea.)
> The advanced answers to this question require spending a lot of time understanding string processing.
No, the advanced answers are a simple application of dynamic programming. If you've never heard of dynamic programming before, you're unlikely to invent it in response to an interview question, of course; but if you have heard of it, it might occur to you to try it on this problem.
(Actually, if you've heard of memoization but not dynamic programming, you might invent dynamic programming in response to this question.)
I think this is at the opposite end of the spectrum from your CSS example. Dynamic programming has nothing to do with string processing or with any other particular domain. There's a list of 29 significant algorithms that apply it at http://en.wikipedia.org/wiki/Dynamic_programming#Algorithms_.... It might qualify as "deep knowledge", but it's not deep domain knowledge; it's the kind of deep knowledge that would make you want to hire someone from a different domain.
Counterpoint: @Retric congratulations, you’ve reinvented dynamic programming with your O(n) solution That’d be a perfect solution.
And yes, I had zero idea what memorization or dynamic programming meant in that context. After looking it up on wikipedia it seems to mean caching intermediate steps to avoid recalculating them which seems obvious enough.
Indeed. The only detail that surprised me was that you could only do exact lookups in the dictionary. That makes it O(n*n). If you had the dictionary stored in a trie it would be O(n) on long strings. (With a maximum constant whose size depends on the length of the longest word in the dictionary.)
You could store the dictionary in a trie in O(m) time before you start. But I don't think it's true that it makes it O(n); being able to tell that the prefix you're considering is the prefix of some word doesn't help you, because you still don't know if the suffix of the string after that word is segmentable. (Or even whether the string contains the entire word.)
It does help you. You're following a dynamic programming solution. Suppose that the maximum dictionary word is of length k. Then for each position in the string you have a maximum of k previous positions that you're tracking for, "We could start a next word here."
Once you've scanned the string, only then do you discover whether the whole string is segmentable.
Put another way, while you're processing you don't immediately know whether or not the whole string is segmentable. But having a trie can let you discard possibilities early. Discarding work early means doing less work means being more efficient.
It's true that the trie allows you to prune the search tree, but I don't think that gets you to O(N). The maximum-dictionary-word-length check does get you to O(N), though, or rather O(kN) if you consider the dictionary as part of the input.
(Yes, not a perfect analogy, but it hopefully gives the idea.)