Hacker News new | ask | show | jobs
by hgibbs 1606 days ago
As a point of curiosity, what do you mean by optimal?

To me, I can see at least two (potentially different) cost functions that a Wordle strategy can aim to minimise. Firstly, you can try to minimise the expected number of guesses, and secondly you can try to minimise the expected number of guesses conditional on never losing the game. In principle you could optimise for the first but not the second by allowing a small list of hard words to take > 6 guesses on average.

Part of my point is the lack of an absolute definition of optimal: strategies are only optimal up to the specified coat function.

4 comments

IMO, never losing the game is the bare minimum for any reasonable cost function: never take > 6 guesses. Beyond that, you can have different optimization functions based on whether you're trying to minimize the worst-case depth of the decision tree (number of guesses) for each position, or the average depth, or some other reasonable function of the distribution over number of guesses.

Although you could have arbitrarily many cost functions, a fairly general class is the following: pick some nonnegative constants (w1, w2, w3, w4, w5, w6, w7), and for a certain strategy, define the cost as:

    w1*n1 + w2*n2 + w3*n3 + w4*n4 + w5*n5 + w6*n6 + w7*n7
where nk (for 1≤k≤6) is the number of hidden (solution) words for which the strategy takes n guesses, and n7 is the number of words for which the strategy loses the game. Then,

• Setting w7 infinite / very high is a way of encoding the condition that you not lose the game. (You can set it finite/small if you don't mind occasionally losing the game for some reason!)

• Setting w1 = 1, w2 = 2, …, w6 = 6 will simply minimize the average number of guesses,

• Having them grow exponentially, e.g. setting wk = c^(k-1), where c is some constant larger than 12972 (the number of acceptable guess words) will make sure that the minimal-cost strategy first minimizes the maximum number of guesses, then break ties by having the fewest words take that many guesses, and so on.

> IMO, never losing the game is the bare minimum for any reasonable cost function: never take > 6 guesses.

I wouldn’t think so. If I get 99.9% of rounds correct with 3 guesses (and fail to find a solution to 0.1% of the rounds), and you get 100% of rounds with 6 guesses, I’d say I’ve soundly defeated you 99.9% of the time.

Right, that point of view makes sense if you imagine two players separately playing Wordle and comparing their results, and indeed that's what the OP may have been thinking. But as far as I know, Wordle is primarily a game for one player to play against the computer, where eventually getting to the word (or not) is the most prominent outcome (the only notion of "win" or "defeat" in the game itself), and solving it in fewer guesses is just a bonus. But sure, if you don't care about occasionally losing, you can just set an appropriate weight for w7 within the same family of cost functions.
Indeed, it’s essentially a single player game, but we’re talking about which bot a player would want to use which is inherently a notion of bots competing against each other.
If you look at the failed attempts in the article, you'll see it's not so easy to get 100% correct with 6 guesses. For example:

SLATE, CRONY, HOUND, BOUND, POUND, MOUND (correct: WOUND)

SLATE, SHARE, SHAME, SHAPE, SHAKE, SHADE (correct: SHAVE)

...so sometimes even if it gets 4 of 5 letters right on the second try, it still has too many options left. Of course in cases like this it could go "one step back" and try a word with (most of) the letters that are different between these words (HBPMW or MPKDV). So try different words until the number of options is narrowed down sufficiently.

> I’d say I’ve soundly defeated you 99.9% of the time.

Except it wouldn't be 99.9% of the time. Just the games were you guessed fewer than GP.

GP's solver would expect to guess the correct word 100% of the time in fewer than 6 guess, but not that it would always take 6 guesses.

I was saying if you hypothetically got every correct guess in exactly 6 tries. You would pass the suggested “bare minimum” while my 99.9% strategy doesn’t, yet I claim my strategy is better.
For some value of better, sure.
Sure, where "some value of better" is "wins in the overwhelming majority of rounds."
> never take > 6 guesses

I suspect that with most languages it’s impossible to achieve 100% win rate for five or more letters. So w7 should be by far the largest weight but I don’t think it can be infinite.

I suspect the opposite. While the theoretical number of combinations is huge, the amount of information given by each guess grows faster than the number of real words of longer lengths.
They specified that in the post, they were able to constrain to a strategy that guarantees 6 or fewer and then optimize for EV to get 3.55, but could get down to 3.52 by allowing some small number of words to take more than 6.

I would be interested to know if 6 is the lowest you can constrain it. Can you guarantee a solution in 5, and if so what is the best EV with that constraint?

Yes, you can -- I believe the EV was around 3.47 for that. The "greedy" strategy (minimizing the worst-case size of the largest set) also ends up using at most 5 words.

Perhaps unsurprisingly, you can't guarantee at most 4 words.

You can get EV of 3.46393 with a guaranteed maximum of five tries. (I don't know whether this bound is tight.)
Good to know.

Another possible question is: What is the best strategy minimizing (at any point) _first_ the maximum number of tries, and second the EV (average number of tries over all possible words)? That's subtly different from either answer.

Funnily enough, I think neither of those notions are the correct thing to optimize. For me, you want to minimize the maximum number of guesses. This leads to an equilibrium. Otherwise, your opponent can just abuse your strategy.
My point is exactly that though - the choice of cost function is subjective. There is no best cost function, just a mapping from costs functions to optimal strategies.
The (exact) optimal EV-minimizing strategies (assuming each target word is equally likely) always solve in 5 of fewer guesses in normal mode, and always solve in 6 guesses in hard mode. If you wish to guarantee 5 guesses in hard mode, there's a different (optimal) strategy with slightly higher EV. See: http://sonorouschocolate.com/notes/index.php?title=The_best_...

I don't think a maximum-number-of-guesses optimizer leads to an equilibrium (if you mean Nash equilibrium), assuming you're playing the game where the score of the game for the setter is the number of guesses. In particular, if the solver's strategy is deterministic, the setter will always pick one of the words that needs 5 guesses. There's no reason to think the nash equilibrium can be easily found.

Adversarial strategies are another type of problem entirely (as in, if you're playing Word Mastermind). Wordle is a 1-player game, so I find hgibbs's comment totally reasonable. I would think that 'optimal' is more accurately his second definition - minimizing avg guesses but always winning (as in, losing is +inf guesses).
Isn't that just #2, but changing the definition of losing to be the least number of rounds where you can always win (which for Wordle is 5)? These small changes to the definition don't seem to change ev (or the code needed to generate it) that much.

If you don't care about optimizing the EV at all, the greedy tree already has the property you're interested in.

What do you mean by opponent?

The entity choosing the word in a game like Wordle or Hang Man is more like the dungeon master in a game of D&D than an opponent in the traditional sense? They are there to help you have fun.

If two players are competing in a single round of wordle, then surely the player who guesses the correct word in the fewer number of guesses wins, right? Then it seems natural to say that the better of two players is the one with the most wins after playing all possible rounds.

The only problem is that I think this definition leads to the possibility of non-transitivity, where A beats B, B beats C, and C beats A.