by wenc 1629 days ago
What’s your optimal 5-letter starting word?

Mine is HOUSE. It has 3 vowels (including E, the most frequent letter in English) and S, which helps test for plurals.

19 comments

I always begin with PENIS. Maybe not optimal statistically, but emotionally.
In the 1960s my dad programmed Jotto (a simpler five-letter secret word game) on Kodak's computers. Entropy became the first interesting mathematical concept that I learned.

Log base 2 of the remaining words is a measure of how many yes/no questions it would take to identify the word. An entropy strategy looks for a clue word that minimizes the expected value of this measure. One optimizes sum p log p over the pile sizes.
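
A minimal Python sketch of that measure, under assumptions that are not in the comment: the clue is scored with Wordle-style green/yellow/grey feedback (Jotto itself only reports a count), and the names `feedback` and `expected_log2_remaining` and the four-word list are made up for illustration. If a clue splits N candidates into piles of size n_i, with p_i = n_i / N, then the expected log2 of the remaining pile is sum p_i log2(n_i) = log2(N) + sum p_i log2(p_i), which is where the sum p log p comes from.

```python
import math
from collections import Counter


def feedback(guess: str, secret: str) -> str:
    """Wordle-style pattern: 'g' green, 'y' yellow, '-' grey."""
    result = ["-"] * len(guess)
    # Secret letters not matched in place are the only ones that can go yellow.
    unmatched = Counter(s for g, s in zip(guess, secret) if g != s)
    for i, (g, s) in enumerate(zip(guess, secret)):
        if g == s:
            result[i] = "g"
        elif unmatched[g] > 0:
            result[i] = "y"
            unmatched[g] -= 1
    return "".join(result)


def expected_log2_remaining(guess: str, candidates: list[str]) -> float:
    """Expected log2 of the pile size left after seeing the clue (lower is better)."""
    piles = Counter(feedback(guess, s) for s in candidates)
    total = len(candidates)
    return sum(n / total * math.log2(n) for n in piles.values())


words = ["house", "mouse", "louse", "train"]  # toy list, illustration only
for w in words:
    print(w, round(expected_log2_remaining(w, words), 3))
```

On this toy list HOUSE scores 0.5 and TRAIN about 1.19, so HOUSE is the better clue by this measure.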

Pure mathematicians prefer certain concepts with a religious fervor. Often this has been informed by a reasonable number of problems where a concept has been proven optimal. The best applied mathematicians understand pure math but prefer practical work. To a pure mathematician, the rest are just guessing.

Here, one needs a clearly stated objective function for measuring success. Entropy strategies are often optimal for simple objective functions.

A critical detail for this game: The secret words come from a shorter list than the valid guess words. One wants a guess word that best partitions the shorter list of secret word candidates, not the full list of valid guess words.

Hah funny to see other people strategizing like this..

I did some research and (from what I found) the most common letters in English are:

E, O, T, A, I, R, N, H, S (in no particular order)

So I came up with TRAIN and SHOVE as 2 starting words that use all those letters without repeats, plus V.
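
A quick check of that claim in Python. The variable names are just for illustration, and the nine "common" letters are taken from the list above, not from any independent frequency table.

```python
common = set("EOTAIRNHS")      # the nine letters listed above
openers = ["TRAIN", "SHOVE"]

used = [ch for word in openers for ch in word]
covered = set(used)

print("repeated letters:", len(used) - len(covered))        # 0
print("common letters missed:", sorted(common - covered))   # []
print("extra letters:", sorted(covered - common))           # ['V']
```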

WORLD is also a good one because it uses one of the most common starting letters (W, T, A, O, D) and one of the most common ending letters (E, D, S, T).

Some other good starting words for me have been RIVET, CANDY, PLUCK and BASTE.

Thanks for the hint, I used PLUCK for Wordle 202 and got a very good start.
I used ORATE which was a hilariously bad guess, but surprisingly effective at pruning the search space in later rounds:

Wordle 202 3/6

  00000
  0?0??
  11111
No kidding, PLUCK has LUCK in it today. I went with PIANO first because I like to constrain the vowels, and got it in three.
The optimal starting word is ARISE, as it partitions the possible words most evenly across the different green/yellow/grey colour combinations.
ARISE is good, but AROSE is better, and SOARE is the best: on average it eliminates all but 2264 out of ~12000 5-letter words.

https://github.com/lalaithion/wordle

I wrote a brute-force minimax solver (minimizing expected guesses) which tracks with this information:

  SOARE 3.45
  RAISE 3.46
  ARISE 3.47
  SERAI 3.52
Most 'reasonable' words seem within 0.1 or so of the optimal strategy. I think the second word is likely far more important than the first.
but soare isn't a word
It depends on your word list. Are you using the same word list as Wordle?
I am not. According to the original article, Wordle uses[1] 2,500 common words out of the 12,000 5-letter words in the English language[2]. I use the 5-letter words in the Collins Scrabble dictionary (about 12,000 words).

The assumption you need to make for my analysis to be correct is that the letter patterns in the 2,500 possible answers are statistically similar to the letter patterns in the original 12,000. There are probably some differences between the distributions, and I'd love to rerun my code with the actual word list Wordle uses, but in the absence of that list, I think my code does about as well as possible.

[1] Uses for the answers; I assume it allows all 12,000 for guesses.
[2] NYTimes does not specify which source they used.

Read the JavaScript? It contains both lists. Training on the wrong dictionary, tomorrow you might find yourself in a slump.
The word list is in Wordle code, so you can just grab that.
The answers are also in the code, which opens the door to speedruns.
I like TRAIL - no "E", but the next two big vowels and TRL are very good consonants
What a coincidence. I played this for the first time today and started with TRIAL.
I did some analysis on this last week [https://noxville.medium.com/raising-the-wordle-first-guess-b...]. ROATE left the lowest average number of valid answers (60.424) but 195 in the worst case. RAISE had the second-best average (61.0) but a much better worst case of 168 words. Both are just heuristics; there might be a better word than either if the game were solved.
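
Both figures, average and worst-case remaining answers, come straight from the pile sizes a guess produces. A hedged sketch follows. It is not the linked article's code, the function names and the toy list are invented here, and it assumes the secret is uniformly random and that a correct guess counts as leaving one word. A pile of size n is hit with probability n / N and leaves n words, so the average is sum n^2 / N.

```python
from collections import Counter


def feedback(guess: str, secret: str) -> str:
    """Wordle-style pattern: 'g' green, 'y' yellow, '-' grey."""
    result = ["-"] * len(guess)
    unmatched = Counter(s for g, s in zip(guess, secret) if g != s)
    for i, (g, s) in enumerate(zip(guess, secret)):
        if g == s:
            result[i] = "g"
        elif unmatched[g] > 0:
            result[i] = "y"
            unmatched[g] -= 1
    return "".join(result)


def remaining_stats(guess: str, answers: list[str]) -> tuple[float, int]:
    """Return (average remaining answers, worst-case remaining answers)."""
    piles = Counter(feedback(guess, s) for s in answers)
    total = len(answers)
    average = sum(n * n for n in piles.values()) / total
    worst = max(piles.values())
    return average, worst


answers = ["house", "mouse", "louse", "train"]  # toy list, illustration only
for word in ("house", "train"):
    print(word, remaining_stats(word, answers))
```

On the toy list HOUSE gives an average of 1.5 with a worst case of 2, and TRAIN gives 2.5 with a worst case of 3.
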
I wonder if ROATE is even in the list of words. I'd never heard it and I'd like to think I have a decent vocabulary.

The article says they whittled it down from 12,000 words to around 2,500 words, aiming for words that most(?) people would be familiar with.

Yes, it's in the list of guessable words, but it's not a valid answer word. You need to use the Wordle word list in order to evaluate how good or bad your initial guess is; using another word list will give distorted results.
Yes, that's what I meant: is it valid in Wordle specifically?
That's what I was clarifying with 'Yes, it's in the list of guessable words - however it's not a valid answer word'.

They have two lists of words: one is a list of possible answers, and another is a list of extra (valid) words which can be guessed (in addition to the words in the answer list). Sometimes it might be better to use a non-answer word as a guess: the best case gets worse (since you cannot win immediately) but the average case and worst case both get better.
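
A small sketch of why a non-answer guess can win out. The two lists below are toy stand-ins, not Wordle's real lists, and `average_remaining` and `best_guess` are names invented for this example. Guesses are drawn from both lists but scored only against the possible answers.

```python
from collections import Counter


def feedback(guess: str, secret: str) -> str:
    """Wordle-style pattern: 'g' green, 'y' yellow, '-' grey."""
    result = ["-"] * len(guess)
    unmatched = Counter(s for g, s in zip(guess, secret) if g != s)
    for i, (g, s) in enumerate(zip(guess, secret)):
        if g == s:
            result[i] = "g"
        elif unmatched[g] > 0:
            result[i] = "y"
            unmatched[g] -= 1
    return "".join(result)


def average_remaining(guess: str, answers: list[str]) -> float:
    """Average number of answers still possible after the clue."""
    piles = Counter(feedback(guess, s) for s in answers)
    return sum(n * n for n in piles.values()) / len(answers)


def best_guess(guesses: list[str], answers: list[str]) -> str:
    """Pick the allowed guess that leaves the fewest answers on average."""
    return min(guesses, key=lambda g: average_remaining(g, answers))


answers = ["batch", "catch", "hatch", "latch", "match"]  # toy answer list
extra_guesses = ["climb"]                                # guessable, never the answer

print(best_guess(answers + extra_guesses, answers))
print(average_remaining("climb", answers), average_remaining("batch", answers))
```

Here CLIMB can never be the answer, yet it separates all five answers (average 1.0 left), while every answer word leaves 3.4 on average.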

Ah I understand, thank you for clarifying!
I use "RENTS" which puts the S at the end as a quick test for plurals.
I've been using that as well, just one letter off of the classic R S T L N E
I coded some heuristics and ran them on a Scrabble dictionary: https://github.com/lalaithion/wordle
With the first word you aren't necessarily trying to get the most number of matches.

You are trying to use the word that once you get the match result back, it discards the most number of words.

These two are not the same thing.

> You are trying to use the word that once you get the match result back, it discards the most number of words.

This isn't strictly true either. Two N-sized subsets of words from a common initial set might have completely different difficulty in reducing further, because in the worst case there might not be a valid guess which nicely spreads the remaining words out among the 243 possible outcomes for that guess.

Set-size is a good heuristic, but it's just that - a heuristic.
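
To make that concrete, here is a hedged toy example: two candidate sets of the same size, judged against one shared list of allowed guesses. The word lists and the `best_worst_pile` name are assumptions for illustration only. Each guess has 3 ** 5 = 243 possible colour patterns, but a given set may not have any allowed guess that spreads it out.

```python
from collections import Counter


def feedback(guess: str, secret: str) -> str:
    """Wordle-style pattern: 'g' green, 'y' yellow, '-' grey."""
    result = ["-"] * len(guess)
    unmatched = Counter(s for g, s in zip(guess, secret) if g != s)
    for i, (g, s) in enumerate(zip(guess, secret)):
        if g == s:
            result[i] = "g"
        elif unmatched[g] > 0:
            result[i] = "y"
            unmatched[g] -= 1
    return "".join(result)


def best_worst_pile(candidates: list[str], allowed: list[str]) -> int:
    """Smallest worst-case pile size achievable with any allowed guess."""
    return min(
        max(Counter(feedback(g, s) for s in candidates).values())
        for g in allowed
    )


rhymes = ["batch", "catch", "hatch", "latch"]   # differ only in the first letter
spread = ["train", "house", "moody", "gawks"]   # few letters in common
allowed = rhymes + spread                        # toy list of valid guesses

print(3 ** 5)                            # 243 possible patterns per guess
print(best_worst_pile(rhymes, allowed))  # 3
print(best_worst_pile(spread, allowed))  # 1
```

Both sets have four words, but no allowed guess gets the rhyming set below a pile of three, while HOUSE separates the other set completely.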

Yes, as you'll see in the readme, I have numbers for both approaches. The first section is the average number of remaining words after getting the match result back, and the second section is the average number of yellow and green squares.
You might also try maximum instead of average. This is minimax and represents worst case scenarios for each guess.

This is mostly useful for optimal play against an opponent (which is not the case here). Imagine an adversarial version where the opponent doesn't have to commit to a word at the beginning but must reveal one matching all clues if you can't get it in 6 guesses (basically, they can change their word when you guess and you are trying to make that impossible).
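
A rough sketch of that adversarial version, with invented names (`adversary_reply`, `guesses_needed`) and a toy word list. It assumes the opponent simply keeps the largest pile consistent with all clues so far and avoids conceding on ties. A real adversary could look further ahead.

```python
from collections import Counter, defaultdict


def feedback(guess: str, secret: str) -> str:
    """Wordle-style pattern: 'g' green, 'y' yellow, '-' grey."""
    result = ["-"] * len(guess)
    unmatched = Counter(s for g, s in zip(guess, secret) if g != s)
    for i, (g, s) in enumerate(zip(guess, secret)):
        if g == s:
            result[i] = "g"
        elif unmatched[g] > 0:
            result[i] = "y"
            unmatched[g] -= 1
    return "".join(result)


def adversary_reply(guess: str, candidates: list[str]) -> tuple[str, list[str]]:
    """Return the pattern that keeps the most candidates alive, and those candidates."""
    piles = defaultdict(list)
    for secret in candidates:
        piles[feedback(guess, secret)].append(secret)
    win = "g" * len(guess)
    # Largest pile first; on ties, prefer any pattern that is not a win.
    pattern = max(piles, key=lambda p: (len(piles[p]), p != win))
    return pattern, piles[pattern]


def guesses_needed(guess_order: list[str], candidates: list[str]):
    """Play the guesses in order; return the winning turn, or None if never forced."""
    remaining = list(candidates)
    for turn, guess in enumerate(guess_order, start=1):
        pattern, remaining = adversary_reply(guess, remaining)
        if pattern == "g" * len(guess):
            return turn
    return None


words = ["batch", "catch", "hatch", "latch", "match"]  # toy list
print(guesses_needed(words, words))
print(guesses_needed(["climb", "batch"], words))
```

On this toy list, guessing the rhymes one by one lets the adversary stall until the fifth guess, while opening with CLIMB pins the word down by the second.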

As a handicap, I'm now beginning each new day with the prior day's answer word.
I start with ADIEU, personally.
the optimal 5 letter starting word is SERAI (as computed by someone's AI)
There was an HN post a few weeks ago linking to a blog where someone computed SOARE as the optimal word.

Correction: a month ago - https://news.ycombinator.com/item?id=29439191

I coded a small program, and the sequence of words that I use (not interchangeable, you'd use them in sequence to gather more info) is:

AROSE UNTIL DUCHY BLIMP GAWKS

I also really like JUMPY and INTER
STEAR - not technically a word, but it is accepted, and hits the most common vowels and consonants.
I think it's an archaic spelling of the verb "steer". I remember the edition of Treasure Island I had as a boy had Long John Silver use it in the line "We can stear a course, but who's to set one?" when they are discussing whether to mutiny.

Though when I google that phrase, Silver uses the modern spelling.

I've been using TEARS by the same logic.
Mine was ouija, but house is a lot smarter, d'oh!
SAUCE for me
I always start with UPDOG.
What’s updog?
NOT MUCH WHAT'S UP WITH YOU

thanks

I always start with HAOLE
Last couple times I started with the word which rhymes with "tennis".
The only word I can think of that rhymes with "tennis" is "menace". That's 6 letters. But I think I know what you meant.
What accent do you have that those two words rhyme?