Hacker News new | ask | show | jobs
by mcintyre1994 4566 days ago
I know it's a bit off-topic, but I'm a student so anything about interviews interests me a lot. For the Bacon Number question, I'm struggling a bit to get my head around a good way to do it.

My immediate idea is to derive a graph where nodes are actors and an edge between 2 actors means they've been in 1 or more films together. I believe the shortest path between them and Bacon could then be found by breadth-first search.

It looks like that graph would be quite slow to derive though - my naive pseudocode being something like:

    for bucket in hashtable:
        for actor in bucket:
            if actor not in graph, add a node for actor
            for other_actor in bucket (up to actor):
               add an edge between other_actor and actor
That looks pretty slow without deriving a running time, so I'm guessing there's some optimisation that can be made from the current structure of the hashtable. Could anybody give me a pointer in the right direction please?
3 comments

It is the final inner loop that is problematic.

Instead, your graph should comprise movies and actors as vertices (edge = actor was in a given movie), and to get the Bacon number you just halve the distance in this graph.

So you code would look like

    for bucket in hashtable:
        add node for bucket
        for actor in bucket:
            if actor not in graph, add a node for actor
            add an edge between actor and bucket
            add an edge between bucket and actor
If each movie contains exactly K actors, and there are N movies, then your graph has O(NK^2) edges while mine has O(NK) edges.
Ah that's awesome, really smart optimisation, thanks. I hadn't really thought of the possibility of having the two different features in the same graph but that's definitely better!
A simple upgrade to your algorithm would be bidirectional search: https://en.wikipedia.org/wiki/Bidirectional_search

This image from Norvig's AI may help explain it better than the wiki page: http://www.massey.ac.nz/~a159302/lesson3/fig03_17.gif

That seems like an interesting alternative to BFS, thanks.
There's no need to construct the graph. The dictionary already is a graph.
I appreciate that, but I don't see a way to efficiently do a breadth first search or some similar search routine that will find the Bacon number on it. I mean, surely you need at some point to be able to follow edges from an actor - and that's really inefficient in this format?