	by wenc 1629 days ago
	What’s your optimal 5-letter starting word? Mine is HOUSE. It has 3 vowels (including E which is the most frequent in the English language) and S which helps test for plurals.

19 comments

ineptech 1629 days ago

I always begin with PENIS. Maybe not optimal statistically, but emotionally.

Syzygies 1629 days ago

In the 1960s my dad programmed Jotto (a simpler five-letter secret word game) on Kodak's computers. Entropy became the first interesting mathematical concept that I learned.

Log base 2 of the number of remaining words is a measure of how many yes/no questions it would take to identify the word. An entropy strategy looks for a clue word that minimizes the expected value of this measure. One optimizes sum p log p over the pile sizes, where p is the fraction of the remaining words that lands in each pile.

Pure mathematicians prefer certain concepts with a religious fervor. Often this has been informed by a reasonable number of problems where a concept has been proven optimal. The best applied mathematicians understand pure math but prefer practical work. To a pure mathematician, the rest are just guessing.

Here, one needs a clearly stated objective function for measuring success. Entropy strategies are often optimal for simple objective functions.

A critical detail for this game: The secret words come from a shorter list than the valid guess words. One wants a guess word that best partitions the shorter list of secret word candidates, not the full list of valid guess words.
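A minimal sketch of the entropy idea above, in Python. The function names (`feedback`, `expected_bits_left`, `best_guess`) and the tiny word lists are illustrative assumptions, not anyone's actual solver. Each guess is scored by the expected log2 of the pile it leaves behind, and the piles are built from the secret-word candidates only, while guesses may come from the longer list:

```python
import math
from collections import Counter

def feedback(guess, answer):
    """Wordle-style clue: 'g' = right spot, 'y' = wrong spot, '-' = absent."""
    clue = ["-"] * len(guess)
    unmatched = Counter()
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            clue[i] = "g"
        else:
            unmatched[a] += 1  # answer letters still available for yellows
    for i, g in enumerate(guess):
        if clue[i] == "-" and unmatched[g] > 0:
            clue[i] = "y"
            unmatched[g] -= 1
    return "".join(clue)

def expected_bits_left(guess, secret_candidates):
    """Expected log2(pile size) after seeing the clue, assuming a uniform secret."""
    piles = Counter(feedback(guess, s) for s in secret_candidates)
    total = len(secret_candidates)
    return sum((n / total) * math.log2(n) for n in piles.values())

def best_guess(valid_guesses, secret_candidates):
    """Pick the guess that leaves the fewest expected yes/no questions."""
    return min(valid_guesses, key=lambda g: expected_bits_left(g, secret_candidates))

# Hypothetical toy lists: the guess list is a superset of the secret list.
secret_candidates = ["crane", "crate", "grate", "slate"]
valid_guesses = secret_candidates + ["clubs"]

print(expected_bits_left("slate", secret_candidates))  # 0.5
print(best_guess(valid_guesses, secret_candidates))
```

Minimizing the expected log2 pile size is the same as maximizing the entropy of the clue, since the two differ only by the constant log2 of the candidate count.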

sbarre 1629 days ago

Hah, funny to see other people strategizing like this.

I did some research and (from what I found) the most common letters in English are:

E, O, T, A, I, R, N, H, S (not in any order)

So I came up with TRAIN and SHOVE as 2 starting words that use all those letters without repeats, plus V.

WORLD is also a good one because it uses one of the most common starting letters (W, T, A, O, D) and one of the most common ending letters (E, D, S, T).

Some other good starting words for me have been RIVET, CANDY, PLUCK and BASTE.
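A quick way to check the TRAIN and SHOVE claim, sketched in Python. The `coverage` helper is a hypothetical name, and the letter set is simply the nine letters listed above, not a frequency table from any particular source:

```python
COMMON = set("EOTAIRNHS")  # the nine letters listed in the comment above

def coverage(words):
    """Return (set of letters used, True if any letter repeats across the words)."""
    letters = "".join(words)
    used = set(letters)
    return used, len(used) < len(letters)

used, has_repeats = coverage(["TRAIN", "SHOVE"])
print(sorted(COMMON - used))  # [] : every listed letter is covered
print(sorted(used - COMMON))  # ['V'] : the one extra letter
print(has_repeats)            # False
```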

ryzvonusef 1629 days ago

Thanks for the hint, I used PLUCK for wordle 202 and got a very good start

GrantZvolsky 1629 days ago

I used ORATE which was a hilariously bad guess, but surprisingly effective at pruning the search space in later rounds:

Wordle 202 3/6

  00000
  0?0??
  11111

labster 1629 days ago

No kidding, PLUCK has LUCK in it today. I went with PIANO first because I like to constrain the vowels, and got it in three.

iams 1629 days ago

The optimal starting word is ARISE, as it partitions the possible words most evenly across the different green/yellow/grey colour combinations
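To make "partitions the possible words" concrete, here is a rough Python sketch that groups candidate words by the colour pattern a guess would produce. The names `feedback` and `partition` and the four-word list are assumptions for illustration. The duplicate-letter rule (a letter turns yellow only while unmatched copies remain in the answer) is how the game is generally understood to score, not something taken from its source:

```python
from collections import Counter, defaultdict

def feedback(guess, answer):
    """Colour pattern for a guess: 'g' green, 'y' yellow, '-' grey."""
    clue = ["-"] * len(guess)
    unmatched = Counter()
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            clue[i] = "g"
        else:
            unmatched[a] += 1  # answer letters not consumed by a green
    for i, g in enumerate(guess):
        if clue[i] == "-" and unmatched[g] > 0:
            clue[i] = "y"
            unmatched[g] -= 1
    return "".join(clue)

def partition(guess, candidates):
    """Group candidates by the pattern they would show for this guess."""
    piles = defaultdict(list)
    for word in candidates:
        piles[feedback(guess, word)].append(word)
    return dict(piles)

print(feedback("arise", "raise"))  # yyggg
print(feedback("speed", "abide"))  # --y-y (only one E can turn yellow)
print(partition("slate", ["crane", "crate", "grate", "slate"]))
```

A guess that spreads the candidates over many small piles is "even" in the sense described above; a guess that leaves one big pile is not.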

lalaithion 1629 days ago

ARISE is good, but AROSE is better, and SOARE is the best: On average it eliminates all but 2264 out of ~12000 5-letter words.

https://github.com/lalaithion/wordle

npinsker 1629 days ago

I wrote a brute-force minimax solver (minimizing expected guesses) which tracks with this information:

  SOARE 3.45
  RAISE 3.46
  ARISE 3.47
  SERAI 3.52

Most 'reasonable' words seem within 0.1 or so of the optimal strategy. I think the second word is likely far more important than the first.

fsckboy 1623 days ago

but soare isn't a word

iams 1629 days ago

It depends on your word list. Are you using the same word list as Wordle?

lalaithion 1629 days ago

I do not. According to the original article, Wordle uses[1] 2,500 common words out of the 12,000 5-letter words in the English language[2]. I use the 5-letter words in the Collins Scrabble dictionary (which is about 12,000 words).

The assumption you need to make for my analysis to be correct is that the letter patterns in the 2,500 possible answers are statistically similar to the letter patterns in the original 12,000. There are probably some differences between the distributions, and I'd love to rerun my code with the actual word list Wordle uses, but in the absence of that list, I think that my code does about as well as possible.

[1] uses for the answers; I assume it allows all 12,000 for guesses.

[2] NYTimes does not specify which source they used.

Syzygies 1629 days ago

Read the JavaScript? It contains both lists. Training on the wrong dictionary, tomorrow you might find yourself in a slump.

xPaw 1629 days ago

The word list is in Wordle code, so you can just grab that.

kelseyfrog 1629 days ago

The answers are also in the code, which opens the door to speedruns.

bradleybuda 1629 days ago

I like TRAIL - no "E", but the next two big vowels and TRL are very good consonants

Sohcahtoa82 1629 days ago

What a coincidence. I played this for the first time today and started with TRIAL.

noxvilleza 1629 days ago

I did some analysis on this last week [https://noxville.medium.com/raising-the-wordle-first-guess-b...]. ROATE left the lowest average number of valid answers (60.424) but 195 in the worst case. RAISE had the second-best average (61.0) but a much better worst case of 168 words. Both are just heuristics; there might be a better word than either if the game were solved.
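One plausible way to compute an "average remaining" and a "worst case" figure for a guess, sketched in Python. This is an assumption about the metric, not the code behind the linked analysis, and the word list is a toy. If the answer is uniformly random, it lands in a pile of size n with probability n/total and leaves n candidates, so the average is the sum of n squared divided by total:

```python
from collections import Counter

def feedback(guess, answer):
    """Colour pattern for a guess: 'g' green, 'y' yellow, '-' grey."""
    clue = ["-"] * len(guess)
    unmatched = Counter()
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            clue[i] = "g"
        else:
            unmatched[a] += 1
    for i, g in enumerate(guess):
        if clue[i] == "-" and unmatched[g] > 0:
            clue[i] = "y"
            unmatched[g] -= 1
    return "".join(clue)

def remaining_stats(guess, answers):
    """Return (average, worst-case) number of answers still possible after the clue."""
    piles = Counter(feedback(guess, a) for a in answers)
    total = len(answers)
    average = sum(n * n for n in piles.values()) / total
    worst = max(piles.values())
    return average, worst

answers = ["crane", "crate", "grate", "slate"]  # hypothetical answer list
print(remaining_stats("slate", answers))  # (1.5, 2)
print(remaining_stats("crane", answers))  # (1.0, 1)
```

Note that this counts the answer itself as "remaining" even when the guess was correct; a different convention would shift the numbers slightly.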

sbarre 1629 days ago

I wonder if ROATE is even in the list of words. I'd never heard it and I'd like to think I have a decent vocabulary.

The article says they whittled it down from 12,000 words to around 2,500 words, aiming for words that most(?) people would be familiar with.

noxvilleza 1629 days ago

Yes, it's in the list of guessable words - however it's not a valid answer word. You need to use the WORDLE word list in order to evaluate how good or bad your initial guess is; using another word list will provide distorted results.

sbarre 1629 days ago

Yes, that's what I meant: is it valid in Wordle specifically?

noxvilleza 1629 days ago

That's what I was clarifying with 'Yes, it's in the list of guessable words - however it's not a valid answer word'.

They have two lists of words: one is a list of possible answers, and another is a list of extra (valid) words which can be guessed (in addition to the words in the answer list). Sometimes it might be better to use a non-answer word as a guess: the best case gets worse (since you cannot win immediately) but the average case and worst case both get better.
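A toy illustration of why a non-answer guess can help, in Python. Both word lists here are made up for the example (they are not Wordle's lists), and `worst_case` is a hypothetical helper that reports the largest pile of answers a guess can leave:

```python
from collections import Counter

def feedback(guess, answer):
    """Colour pattern for a guess: 'g' green, 'y' yellow, '-' grey."""
    clue = ["-"] * len(guess)
    unmatched = Counter()
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            clue[i] = "g"
        else:
            unmatched[a] += 1
    for i, g in enumerate(guess):
        if clue[i] == "-" and unmatched[g] > 0:
            clue[i] = "y"
            unmatched[g] -= 1
    return "".join(clue)

def worst_case(guess, answers):
    """Size of the largest pile of answers that share one clue for this guess."""
    return max(Counter(feedback(guess, a) for a in answers).values())

answers = ["batch", "catch", "hatch", "latch"]  # hypothetical answer list
extra_guesses = ["clubs"]                       # hypothetical guess-only word

for guess in answers + extra_guesses:
    print(guess, worst_case(guess, answers))
```

In this toy case every answer word leaves three candidates in the worst case, because the four answers differ only in their first letter. "clubs" cannot win on the spot, but it tests B, C and L at once and separates all four answers.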

sbarre 1628 days ago

Ah, I understand, thank you for clarifying!

evan_ 1629 days ago

I use "RENTS" which puts the S at the end as a quick test for plurals.

lapetitejort 1629 days ago

I've been using that as well, just one letter off of the classic R S T L N E

lalaithion 1629 days ago

I coded some heuristics and ran them on a Scrabble dictionary: https://github.com/lalaithion/wordle

iams 1629 days ago

With the first word you aren't necessarily trying to get the greatest number of matches.

You are trying to use the word that once you get the match result back, it discards the most number of words.

These two are not the same thing.

noxvilleza 1629 days ago

> You are trying to use the word that once you get the match result back, it discards the most number of words.

This isn't strictly true either. Two N-sized subsets of words from a common initial set might have completely different difficulty in reducing further, because in the worst case there might not be a valid guess which nicely spreads the remaining words out among the 243 possible outcomes for that guess.

Set-size is a good heuristic, but it's just that - a heuristic.
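The 243 comes from three colours in each of five positions (3^5). A small Python sketch of the point about equal-sized sets, using two made-up four-word sets and, as a simplifying assumption, allowing only the candidates themselves as guesses. `most_outcomes` is a hypothetical helper that reports how many distinct outcomes the best available guess produces:

```python
from collections import Counter
from itertools import product

ALL_PATTERNS = ["".join(p) for p in product("-yg", repeat=5)]
print(len(ALL_PATTERNS))  # 243

def feedback(guess, answer):
    """Colour pattern for a guess: 'g' green, 'y' yellow, '-' grey."""
    clue = ["-"] * len(guess)
    unmatched = Counter()
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            clue[i] = "g"
        else:
            unmatched[a] += 1
    for i, g in enumerate(guess):
        if clue[i] == "-" and unmatched[g] > 0:
            clue[i] = "y"
            unmatched[g] -= 1
    return "".join(clue)

def most_outcomes(guesses, candidates):
    """Largest number of distinct clues any single guess produces on the candidates."""
    return max(len({feedback(g, c) for c in candidates}) for g in guesses)

set_a = ["batch", "catch", "hatch", "latch"]  # hypothetical: differ in one letter
set_b = ["crane", "crate", "grate", "slate"]  # hypothetical: differ in several

print(most_outcomes(set_a, set_a))  # 2
print(most_outcomes(set_b, set_b))  # 4
```

Both sets have four words, but the second can be fully separated by one guess from within the set and the first cannot, which is why set size alone is only a heuristic.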

lalaithion 1629 days ago

Yes, as you'll see in the readme, I have numbers for both approaches. The first section is the average number of remaining words after getting the match result back, and the second section is the average number of yellow and green squares.

periodontal 1629 days ago

You might also try maximum instead of average. This is minimax and represents worst case scenarios for each guess.

This is mostly useful for optimal play against an opponent (which is not the case here). Imagine an adversarial version where the opponent doesn't have to commit to a word at the beginning but must reveal one matching all clues if you can't get it in 6 guesses (basically, they can change their word when you guess and you are trying to make that impossible).
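The adversarial version can be sketched in a few lines of Python. This is only one assumed way to play the opponent: after each guess it answers with whichever clue keeps the most words alive. The function names and the word list are hypothetical:

```python
from collections import Counter, defaultdict

def feedback(guess, answer):
    """Colour pattern for a guess: 'g' green, 'y' yellow, '-' grey."""
    clue = ["-"] * len(guess)
    unmatched = Counter()
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            clue[i] = "g"
        else:
            unmatched[a] += 1
    for i, g in enumerate(guess):
        if clue[i] == "-" and unmatched[g] > 0:
            clue[i] = "y"
            unmatched[g] -= 1
    return "".join(clue)

def adversarial_reply(guess, candidates):
    """Opponent's move: return the clue (and survivors) that keeps the most words."""
    piles = defaultdict(list)
    for word in candidates:
        piles[feedback(guess, word)].append(word)
    # Ties go to the first pile seen; a real opponent might break ties differently.
    return max(piles.items(), key=lambda item: len(item[1]))

candidates = ["crane", "crate", "grate", "slate"]  # hypothetical word list
clue, survivors = adversarial_reply("slate", candidates)
print(clue, survivors)  # --ggg ['crate', 'grate']
```

Against this opponent, the guess that minimizes the maximum pile size is the natural choice, which is the minimax criterion described above.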

gojomo 1629 days ago

As a handicap, I'm now beginning each new day with the prior day's answer word.

mmastrac 1629 days ago

I start with ADIEU, personally.

rusbus 1629 days ago

The optimal 5-letter starting word is SERAI (as computed by someone's AI)

MerelyMortal 1629 days ago

There was an HN post a few weeks ago to a blog where someone computed SOARE as the optimal word.

Correction: a month ago - https://news.ycombinator.com/item?id=29439191

jgrahamc 1629 days ago

https://twitter.com/jgrahamc/status/1479189616846639110

slazaro 1629 days ago

I coded a small program, and the sequence of words that I use (not interchangeable, you'd use them in sequence to gather more info) is:

AROSE UNTIL DUCHY BLIMP GAWKS
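A small Python check of how much of the alphabet that sequence covers. `new_letters_per_word` is a hypothetical helper, not part of the program mentioned above; it counts how many previously unseen letters each word adds when the words are played in order:

```python
import string

SEQUENCE = ["AROSE", "UNTIL", "DUCHY", "BLIMP", "GAWKS"]

def new_letters_per_word(words):
    """Return (count of new letters each word adds, set of all letters seen)."""
    seen = set()
    gains = []
    for word in words:
        fresh = set(word) - seen
        gains.append(len(fresh))
        seen |= fresh
    return gains, seen

gains, seen = new_letters_per_word(SEQUENCE)
print(gains)                                        # [5, 5, 4, 3, 3]
print(len(seen))                                    # 20
print(sorted(set(string.ascii_uppercase) - seen))   # ['F', 'J', 'Q', 'V', 'X', 'Z']
```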

ddoeth 1629 days ago

I also really like JUMPY and INTER

lostinquebec 1629 days ago

STEAR - not technically a word, but it is accepted, and hits the most common vowels and consonants.

Ichthypresbyter 1629 days ago

I think it's an archaic spelling of the verb "steer". I remember the edition of Treasure Island I had as a boy has Long John Silver use it in the line "We can stear a course, but who's to set one?" when they are discussing whether to mutiny.

Though when I google that phrase, Silver uses the modern spelling.

defect0 1629 days ago

I've been using TEARS by the same logic.

detritus 1629 days ago

Mine was ouija, but house is a lot smarter, d'oh!

random314 1629 days ago

SAUCE for me

nvr219 1629 days ago