| HN Mirror

I just took a shot at solving this puzzle for the /usr/share/dict wordlist (70k words when filtered to strictly alpha), using Python.

Assuming the "first build a graph" approach is the most efficient one, it's definitely taking me longer than 15 minutes to get a reasonably efficient solution. Just for the graph-building part. I'm taking a two-step approach, first connect the matching words that differ in length by one, so filtering for (a == b[1:] or a == b[:-1]), assuming a is always the shorter word of the pair. Checking all words of length N vs length N+1 that way took a few minutes on my (not very powerful) netbook. Not entirely happy with it, but no way I could optimize that further in 15 minutes anyway (if there is an even better algorithm), and that part of the calculation is done.

edit: just thought of a better way

    aa = sorted([words of length N])
    bb = [words of length N+1]

    # now make a sorted list of the longer words, truncated on either side,
    # (paired with the original for retrieval later)
    bb1 = sorted([(w[1:], w) for w in bb] 
        + [(w[:-1], w) for w in bb])

    # now we can do a merge operation on aa and bb1, and find 
    # the matches in O(len(aa)+len(bb1)) instead of O(len(aa)*len(bb))
    #
    # a merge operation that I can't be arsed to write out here, but I
    # promise it'll involve some super clever usage of the itertools module
    #

Now for step 2, words of equal length...