There’s only ~12k-13k five letter English words in the dictionary. Check /usr/share/dict/words file (if you use Windows, go find a dictionary file elsewhere).
The first word should be a word that roughly eliminates half the words, regardless of getting any letters right or wrong.
Count the letters in each position and calculate & rank the words that closely eliminates 1/2 the whole list. Use that as a filter and binary search until there’s only 1 word left.
Wordle is the same game as mastermind and hangman and wheel of fortune.
I’d argue Aeros is not the optimal first word in order to filter and binary search the list of five letter words. Homework to be left to the wordler.
You basically described entropy maximization. But not quite. Eliminating words is just a piece of it, and the goal varies. are you trying to minimize the average number of guesses? Maximize the probability of getting it within N guesses? Etc.
Binary search isn’t the right answer here. You should be able to do far better than splitting things in half.
The greedy approach to solve this would be fairly simple. Find the next choice of n and letter that maximizes information gain recursively until the space is solved.
But a globally optimal approach would be pretty hard. At a minimum each letter has three color states so that’s 5^3 (125) groups of words that will be outputted by your result. The ideal first word, assuming your goal is to minimize average guesses, would try to get these groups as even as possible.
The theoretical perfect first word would thus get 96 possibilities (12000/125) left after a single guess but that’s probably not possible due to the grouping of words.
However a truly globally optimal setup only needs to guarantee that each outcome state has no more than 125 possibilities remaining, all of which are uniquely identifiable with a follow up word.
The goal in wordle or hangman is to guess the word within a set number of guesses.
> minimize average number of guesses
Yes, and it’s def much better than binary search, but the algorithm to arrive at the solution is optimal in that way — minimize the depth of the tree given a set of words/letters, with each branch being “correct letter position”, “correct letter”, or eliminate letter. No need to use abstract coloring concepts either.
No real need for “ML” or anything.
If someone had the time, they can prebuilt all the decision trees and pick the optimal words to as the starting word.
Make a hangman solver was actually a take home interview question I had back in like 2009. Code it up, it’s a lot easier than it sounds.
I think you’re missing what I’m saying. I’m not talking about ML. I’m taking about mixed integer programming. Minimizing the depth of the trees is the goal, not a method of getting there. Most discussion I see on this is on greedy methods that choose the first step that will maximize information gain and then the next best and so forth. But it’s possible there exists a first step that is less optimal up front but allows for more entropy on the second steps. I’m not sure if the greedy solutions are able to guarantee a three step solution for all possible words. Even if they are, it’s probable that they’re still inferior to the globally optimized answer.
There’s only ~12k-13k five letter English words in the dictionary. Check /usr/share/dict/words file (if you use Windows, go find a dictionary file elsewhere).
The first word should be a word that roughly eliminates half the words, regardless of getting any letters right or wrong.
Count the letters in each position and calculate & rank the words that closely eliminates 1/2 the whole list. Use that as a filter and binary search until there’s only 1 word left.
Wordle is the same game as mastermind and hangman and wheel of fortune.
I’d argue Aeros is not the optimal first word in order to filter and binary search the list of five letter words. Homework to be left to the wordler.