by zem 4913 days ago
as a competitive scrabble player, one mistake that I think he might have made is overweighting corpus-based probability versus game playability. his transition-in and transition-out weights are a good start, but there's also the fact that n+1 letter words that can be made by "hooking" (adding a single letter before or after) n letter words are far more useful than those that cannot. also the layout of the premium squares, and the letter distribution of the bag, factor into how playable certain tiles are. intuitively, at the least I'd expect one more point for the U, and for the V to catch up with the Z, though of course it's very easy to fool yourself about these things when using strategies based on the current letter values.

Maybe you and the OP should petition the Words With Friends people to give you a statistically significant data-dump of real games so you can analyze and revalue the letters.

It would be a service to us all.

That would be really interesting data to look at.

There would be a lot of factors to consider that might make it hard to apply to re-valuing Scrabble, though:

1. Most Words With Friends play is casual, and casual players are likely to play very differently than competitive players.

2. The data would show you what words players play given the current scoring system, and wouldn't necessarily translate to another scoring system.

3. Words With Friends is a different game than Scrabble (different board layout, different bingo scoring, different number of tiles and different tile frequency).

4. Players end up playing the majority of the tiles they draw, so the frequency with which they play a letter may be more closely related to how many tiles in the bag carry that letter than to the letter's frequency in the corpus.
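
To make point 4 concrete, here is a rough sketch of how one might compare each letter's share of played tiles with its share of the bag. The function name, the input format (a plain list of played words), and the three-letter toy bag are all assumptions for illustration; a real analysis would plug in the actual tile counts for whichever game the data comes from.

```python
from collections import Counter

def play_share_vs_bag_share(played_words, bag_counts):
    """Return {letter: (share of played tiles, share of bag)}.

    played_words: iterable of words as played (hypothetical input format).
    bag_counts:   dict mapping uppercase letter -> number of tiles in the bag.
    """
    # Count only letters that actually exist in the supplied bag.
    played = Counter(
        ch for word in played_words for ch in word.upper() if ch in bag_counts
    )
    total_played = sum(played.values())
    total_bag = sum(bag_counts.values())
    shares = {}
    for letter, count in bag_counts.items():
        play_share = played[letter] / total_played if total_played else 0.0
        shares[letter] = (play_share, count / total_bag)
    return shares

# Toy bag, NOT a real tile distribution.
toy_bag = {"A": 2, "T": 1, "Z": 1}
shares = play_share_vs_bag_share(["AT", "TA"], toy_bag)
```

If the two numbers track each other closely for most letters, the play data is mostly telling you about the bag rather than about the language.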

Regarding 1: perhaps one could segment the players based on their scores --- the people I know who play scrabble competitively generally have a significantly higher average word score.
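
A minimal sketch of that segmentation, assuming the dump can be reduced to (player, word score) pairs. The record format, the function names, and the cutoff value are all made up for illustration; picking a sensible threshold would need a look at the real score distribution.

```python
from collections import defaultdict
from statistics import mean

def average_word_scores(moves):
    """Map each player to their mean score per word played."""
    by_player = defaultdict(list)
    for player, score in moves:
        by_player[player].append(score)
    return {player: mean(scores) for player, scores in by_player.items()}

def segment_players(moves, threshold):
    """Split players into (strong, casual) by average word score.

    threshold is an assumed cutoff, not a known property of either game.
    """
    averages = average_word_scores(moves)
    strong = {p for p, avg in averages.items() if avg >= threshold}
    casual = set(averages) - strong
    return strong, casual

# Hypothetical records: (player id, score of one played word).
moves = [("a", 40), ("a", 30), ("b", 10), ("b", 14)]
strong, casual = segment_players(moves, threshold=25)
```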
True - in competitive scrabble luck actually plays a pretty small role. It's much closer to chess than it is to poker.
on the granularity of an individual game there's a surprising amount of luck even in the top levels of the game. for instance, i'm at best a high B-division player, but i've won tournament games against two world champions (and, on the flip side, lost to players rated far lower than me). it's the overall performance in an (ideally 15-game+) tournament that lets the best players consistently rise to the top.
The luck element is a result of the malapportionment that the OP mentions (it's good to receive the overvalued tiles, and who receives them is down to luck).
Quite frankly, the Words With Friends board layout and tileset were so badly balanced that after the first couple of weeks I could no longer stand to play it. I think a better route would be to have Quackle play itself millions of times and get data from there (it's a reasonably common thing for people developing Scrabble heuristics to do nowadays).
I've come across words in the OSPD that Words with Friends didn't take, as well as words not in the OSPD that it did take. A casual Google suggests that Words with Friends uses a modified version of a dictionary called ENABLE[1], so while it would be interesting data to see, the dictionary difference is a little troublesome for conclusions pertaining to Scrabble.

(You can, of course, play Scrabble with any dictionary but most serious folk I know use OSPD. I'd never even heard of ENABLE until Googling this question.)

[1]: http://blogmybrain.com/words-with-friends-cheat/words.txt

Interesting, hooking could argue for weighting letters at the beginning or end of 3+-letter words higher than those in the middle.
it's pretty easy to just explicitly say "if word is hookable" and "if word is a hook", since you have the whole dictionary to hand
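
For instance, a minimal sketch of both checks, assuming the dictionary is loaded as a Python set of lowercase words. The function names and the six-word toy lexicon are illustrative only; a real run would load the full word list being studied.

```python
import string

def front_hooks(word, lexicon):
    """Letters that can be added before `word` to form another valid word."""
    return [c for c in string.ascii_lowercase if c + word in lexicon]

def back_hooks(word, lexicon):
    """Letters that can be added after `word` to form another valid word."""
    return [c for c in string.ascii_lowercase if word + c in lexicon]

def is_hookable(word, lexicon):
    """True if at least one single-letter extension of `word` is valid."""
    return bool(front_hooks(word, lexicon) or back_hooks(word, lexicon))

def is_hook(word, lexicon):
    """True if dropping the first or last letter of `word` leaves a valid word."""
    return len(word) > 1 and (word[1:] in lexicon or word[:-1] in lexicon)

# Toy lexicon for illustration, standing in for the full dictionary.
lexicon = {"at", "cat", "cats", "eat", "ate", "scat"}

at_front = front_hooks("at", lexicon)   # letters X such that X + "at" is valid
at_back = back_hooks("at", lexicon)     # letters X such that "at" + X is valid
```

With set lookups each check costs at most 52 membership tests per word, so tagging every word in a full dictionary this way is cheap.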