Hacker News new | ask | show | jobs
by Jach 2706 days ago
> the question is simply: here's a data structure, we need to traverse it in a certain order

Because BFS/DFS are so easy to interchange, I'd strip ordering out entirely: here's a collection of stuff where each element may also point to other stuff in the collection, starting with this element search through the stuff to see if you can find a certain element.

If I was going to ask this problem, I'd help a nervous-looking candidate by contrasting the problem (or leading with) a simpler still one: here's a collection of stuff where each element can point to at most one other thing, starting with this element see if you can find a certain one in the collection.

Linear search is, technically, an algorithm. Graph search just extends it. All these people railing against algorithms in interview problems as non-applicable surely have at least written a for loop over an array and had an if statement somewhere inside, right? Linear search. When I've given interviews I don't try to make them based on "algorithmic knowledge"[0] but at the same time I really don't like the trend of thinking of algorithms in general as "oh, that's someone else's job, I never use those" or "I can only implement algorithms by memorizing them."

[0] When I was just starting to get asked to give some I lazily asked a colleague for a reference problem, they sent me https://leetcode.com/problems/jump-game/description/ and I read the problem, immediately recognized that the general approach of graph searching would solve it[1], and coded up the 12-15 lines or so it took, made some tests, fixed some minor mistakes... Total start-finish time was 10-15 minutes, not the fastest. There's also a solution to that problem that doesn't require thinking of it as a graph, too, but I think it's harder to get the insight. Anyway, nice problem, I thought, I'll only give a problem if the candidate has 2x the time it takes me to solve it as a buffer for interview nervousness/whatever, but after giving it to a few people only one of them managed to solve it in like 50 minutes after creating a pretty verbose and pseudo-coded BFS system. (My initial approach used DFS, but it doesn't matter here -- though for the jump game 2 sequel problem which asks to find the shortest number of jumps, BFS makes sense, and it's easy to take the solution for the first problem and make minor alterations. General algorithms like "search" are great and widely applicable.)

[1] In retrospect, before I saw the problem I was writing a simple game of go client as a side project, and had recently implemented a function to, given a point on the board, determine if there's a stone there and if so determine if it's part of a larger group of stones and return their coordinates. So in some sense the problem was already 'primed', and I've written a lot of graph search skeletons for both fun and profit. It's interesting to hear that some people go decades without doing so...

1 comments

This problem can be solved with my favorite algorithm in the whole world, union-find [1], yaaay! :-p

It is hard to say if U/F would be more efficient for this problem. U/F can perform two `union(a, b)` and `find(a, b)` operations in O(lg n) runtime complexity, or, with some more effort (path compression), nearly constant time (a, b representing node indexes). But for this problem a first pass over all indexes would be required to build the disjoint set from the input.

I think for small inputs DFS or BFS would be more efficient, and have the plus of not needing extras storage (U/F would require a second array the same size than the input). For large arrays, probably U/F would win since the input array may encode potentially a lot of edges (say, if each index contains a number large enough) and DFS runtime complexity is O(Edges + Nodes).

Anyway, I think U/F can be really useful in the context of job interviews, there are many problems that can be reduced to connected components, after some massaging.

Here's an implementation I did a while ago [2]. Even though I love the algo, I don't really remember the details about path compression... time to refresh my memory I guess :-D

[1]: https://algs4.cs.princeton.edu/15uf/

[2]: https://gist.github.com/EmmanuelOga/bcafad14715a3f584e97

Neat structure! I consulted my copy of Skiena's Algorithm Design Manual. He says: "Union-find is a fast, simple data structure that every programmer should know about." Well... Now I do. :) It is simple, a 'backwards' tree with pointers from a node to its parent, which lets you union two separate trees together by just taking the shorter one's root and pointing it at the taller one's (or vice versa, but this way preserves log behavior). The path compression optimization seems to just be an extra pass in find(), after you have the result, to re-parent each item along the path to the found root parent so that any future finds() of any of those items will only have one lookup to reach the component root.

For the jump game problem, I'd expect DFS would still win almost all the time even when there are many branches because it doesn't have to explore every element of the input array except in the worst case (and you can greedily explore a neighbor without having to discover your other neighbors first), whereas to build a complete union-find structure would. Still, the fastest solution in general is probably just the single pass: if you have the insight that you can keep track of a 'max reachable index' and update it at each step to max(current-max, i + A[i]), bailing with failure if you hit an index > the current max and having no more jumps, or bailing with success if your max becomes >= the final index. (I never had this insight myself.) It could be slower of course, e.g. if the array was something like [bigNum/2, lots of 0s, bigNum/2 again in the middle, lots of 0s, end at index bigNum].