by javajosh 4913 days ago
Maybe you and the OP should petition the Words With Friends people to give you a statistically significant data-dump of real games so you can analyze and revalue the letters.

It would be a service to us all.

3 comments

That would be really interesting data to look at.

There would be a lot of factors to consider that might make it hard to apply to re-valuing Scrabble, though:

1. Most Words With Friends play is casual, and casual players are likely to play very differently than competitive players.

2. The data would show you what words players play given the current scoring system, and wouldn't necessarily translate to another scoring system.

3. Words With Friends is a different game than Scrabble (different board layout, different bingo scoring, different number of tiles and different tile frequency).

4. Players end up playing the majority of the tiles they draw, so the frequency with which they play a letter may be more closely related to how many of the printed tiles carry that letter than to the frequency of the letter in the corpus.

Regarding 1: perhaps one could segment the players based on their scores. The people I know who play Scrabble competitively generally have a significantly higher average word score.
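A rough sketch of that segmentation, in Python. The input format (a list of `(player_id, word_score)` pairs), the function name, and the threshold of 25 are all made up for illustration; nothing here reflects an actual Words With Friends export.

```python
from collections import defaultdict
from statistics import mean

def segment_players(moves, threshold):
    """Split players into (competitive, casual) by average word score.

    moves: iterable of (player_id, word_score) pairs (assumed format).
    threshold: average word score at or above which a player is
               treated as competitive (an arbitrary cut-off).
    """
    scores = defaultdict(list)
    for player_id, word_score in moves:
        scores[player_id].append(word_score)

    averages = {player: mean(vals) for player, vals in scores.items()}
    competitive = {p for p, avg in averages.items() if avg >= threshold}
    casual = set(averages) - competitive
    return competitive, casual

# Made-up data, only to show the call shape.
moves = [("alice", 42), ("alice", 30), ("bob", 12), ("bob", 9), ("carol", 25)]
competitive, casual = segment_players(moves, threshold=25)
```

A single cut-off is crude; with real data one would probably look at the distribution of averages first and pick the split from that.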
True - in competitive Scrabble, luck actually plays a pretty small role. It's much closer to chess than it is to poker.
at the granularity of an individual game there's a surprising amount of luck, even at the top levels of the game. for instance, i'm at best a high B-division player, but i've won tournament games against two world champions (and, on the flip side, lost to players rated far lower than me). it's the overall performance in an (ideally 15-game+) tournament that lets the best players consistently rise to the top.
The luck element is a result of the malapportionment that the OP mentions (it's good to receive the overvalued tiles, and who receives them is down to luck).
that's part of it, but not nearly as large a part as you'd imagine. there are ways to defend against, e.g., someone getting the X both ways on a triple letter score (the most common large "tile lottery" moment). harder to overcome is someone simply drawing one bingo after another (possibly by being lucky with the blanks and Ss), getting an early 100-200 point lead, and then simply closing the board down (both players have low-scoring moves thereafter, but they already have the lead). equally hard is having a close-fought game be irretrievably lost because you get a final rack with six vowels, or none, or an unplayable Q that hits you with a 20 point penalty and lets your opponent play his final rack out letter by letter, for a large number of points.
Quite frankly, the Words With Friends board layout and tileset were so badly balanced that after the first couple of weeks I could no longer stand to play it. I think a better route would be to have Quackle play itself millions of times and get data from there (it's a reasonably common thing for people developing Scrabble heuristics to do nowadays).
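For what it's worth, the aggregation side of such a self-play experiment could look roughly like this. `play_game` is a hypothetical stand-in for whatever engine you run (it is not Quackle's API, which I'm not reproducing here); it is assumed to return one game's moves as a list of `(word, score)` pairs.

```python
from collections import defaultdict

def mean_score_by_letter(play_game, n_games):
    """Average score of the moves in which each letter appears.

    play_game: hypothetical callable returning one game's moves
               as a list of (word, score) pairs.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for _ in range(n_games):
        for word, score in play_game():
            # Count each letter once per move, even if it repeats in the word.
            for letter in set(word.upper()):
                totals[letter] += score
                counts[letter] += 1
    return {letter: totals[letter] / counts[letter] for letter in totals}

# Toy stand-in for an engine, with made-up moves, only to show the call shape.
def fake_game():
    return [("QI", 22), ("AT", 2), ("QAT", 12)]

averages = mean_score_by_letter(fake_game, n_games=3)
```

Move score is only a crude proxy for a tile's worth (it ignores what the rest of the rack could have done), so treat this as a starting point for the analysis, not a revaluation method.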
I've come across words in the OSPD that Words with Friends didn't take, as well as words not in the OSPD that it did take. A casual Google suggests that Words with Friends uses a modified version of a dictionary called ENABLE[1], so while it would be interesting data to see, the dictionary difference is a little troublesome for conclusions pertaining to Scrabble.

(You can, of course, play Scrabble with any dictionary but most serious folk I know use OSPD. I'd never even heard of ENABLE until Googling this question.)

[1]: http://blogmybrain.com/words-with-friends-cheat/words.txt