Hacker News new | ask | show | jobs
by oneoff786 1612 days ago
You basically described entropy maximization. But not quite. Eliminating words is just a piece of it, and the goal varies. are you trying to minimize the average number of guesses? Maximize the probability of getting it within N guesses? Etc.

Binary search isn’t the right answer here. You should be able to do far better than splitting things in half.

The greedy approach to solve this would be fairly simple. Find the next choice of n and letter that maximizes information gain recursively until the space is solved.

But a globally optimal approach would be pretty hard. At a minimum each letter has three color states so that’s 5^3 (125) groups of words that will be outputted by your result. The ideal first word, assuming your goal is to minimize average guesses, would try to get these groups as even as possible.

The theoretical perfect first word would thus get 96 possibilities (12000/125) left after a single guess but that’s probably not possible due to the grouping of words.

However a truly globally optimal setup only needs to guarantee that each outcome state has no more than 125 possibilities remaining, all of which are uniquely identifiable with a follow up word.

Damn, this sounds like a fun MIP challenge.

1 comments

The goal in wordle or hangman is to guess the word within a set number of guesses.

> minimize average number of guesses

Yes, and it’s def much better than binary search, but the algorithm to arrive at the solution is optimal in that way — minimize the depth of the tree given a set of words/letters, with each branch being “correct letter position”, “correct letter”, or eliminate letter. No need to use abstract coloring concepts either.

No real need for “ML” or anything.

If someone had the time, they can prebuilt all the decision trees and pick the optimal words to as the starting word.

Make a hangman solver was actually a take home interview question I had back in like 2009. Code it up, it’s a lot easier than it sounds.

I think you’re missing what I’m saying. I’m not talking about ML. I’m taking about mixed integer programming. Minimizing the depth of the trees is the goal, not a method of getting there. Most discussion I see on this is on greedy methods that choose the first step that will maximize information gain and then the next best and so forth. But it’s possible there exists a first step that is less optimal up front but allows for more entropy on the second steps. I’m not sure if the greedy solutions are able to guarantee a three step solution for all possible words. Even if they are, it’s probable that they’re still inferior to the globally optimized answer.