Hacker News
by franknstein 1400 days ago
Magnus Carlsen: 2864 Elo

Stockfish: 3585 Elo

"Elo suggested scaling ratings so that a difference of 200 rating points in chess would mean that the stronger player has an expected score (which basically is an expected average score) of approximately 0.75, and the USCF initially aimed for an average club player to have a rating of 1500."

I guess that means that Magnus has an expected score of roughly 0.25^((3585 - 2864)/200) = 0.00675 against Stockfish 15, which is basically 1 in 150 games?

5 comments

That is not quite the right calculation. To see this, try plugging 0.75 into the same formula to get Stockfish's expected score. The result is about 0.3545. If this were the correct formula, then the two expected scores should sum to 1, but in this case we only get 0.3612.

Instead, you should convert the 0.25 to "odds" form. 0.25 is 1:3 odds, represented by the number 1/3. (1/3)^((3585 - 2864)/200) is about 0.01905 (still in odds form). To convert this back to an expected score you would take 0.01905 / (1 + 0.01905) = 0.0187. So Magnus Carlsen's expected score is 0.0187.

Applying the same method to Stockfish, we have 3:1 odds, which is represented by the number 3. 3^((3585 - 2864)/200) is about 52.48. Converting back to expected score we get 52.48 / (1 + 52.48) = 0.9813. So Stockfish's expected score is 0.9813.

Our sanity check is to add 0.0187 + 0.9813. The result is 1.0, as it should be.
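For anyone who wants to reproduce the numbers, here is a minimal Python sketch of both calculations. The function names are made up for illustration, and it assumes the "200 points means 0.75" convention from the quote in the original post rather than any federation's official formula.

```python
def naive_expected_score(p_per_200: float, rating_gap: float) -> float:
    """The original post's approach: raise the score itself to the power gap/200."""
    return p_per_200 ** (rating_gap / 200)


def odds_expected_score(p_per_200: float, rating_gap: float) -> float:
    """The approach from this comment: convert to odds, exponentiate, convert back."""
    odds = p_per_200 / (1 - p_per_200)      # 0.25 -> 1/3, 0.75 -> 3
    scaled = odds ** (rating_gap / 200)     # still in odds form
    return scaled / (1 + scaled)            # back to an expected score


gap = 3585 - 2864  # ratings quoted in the original post

# Sanity check on the naive formula: the two scores should sum to 1, but do not.
naive_sum = naive_expected_score(0.25, gap) + naive_expected_score(0.75, gap)

# Odds-based version for each side.
carlsen = odds_expected_score(0.25, gap)
stockfish = odds_expected_score(0.75, gap)

print(f"naive sum: {naive_sum:.4f}")
print(f"Carlsen: {carlsen:.4f}  Stockfish: {stockfish:.4f}  sum: {carlsen + stockfish:.4f}")
```

The odds form works because odds multiply cleanly across rating gaps, while probabilities do not, so the two sides always come out as complements of each other.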

Elo is dependent upon the pool of players that one plays in. Engine Elo ratings have no relationship to human ratings because no humans play computers under normal conditions. For an example of this phenomenon taken to extremes, Claude Bloodgood [1] was a strong amateur who ended up officially rated as one of the top players in the world (and #2 in the US) simply because he was only playing against a pool of other prison inmates who were in turn playing only against each other. So all his rating reflected was his relative strength in the prison pool.

Computers are definitely much stronger than humans, but not 3600-Elo stronger. Magnus would certainly be able to eke out plenty of draws, if only because white can create "simplified" (as a euphemism for dead) positions in just about any variation if he really wants. And Magnus regularly plays these sorts of positions literally at the level of supercomputers.

I'd also add that much of the dominance of computers is based not on raw ability alone, but also on psychological factors. Humans can become tilted, intimidated, frustrated, tired, and so on. One of the last major human vs computer events was Kramnik vs Fritz. Kramnik, in a relatively simple position, ended up blundering mate in 1 with plenty of time on his clock. It's unlikely he would have ever made the same mistake against a human. It's just very difficult to get in the same mindset when playing against a computer as when playing against a human. Chess, in spite of being a game of complete information, is still extremely influenced by psychology.

[1] - https://en.wikipedia.org/wiki/Claude_Bloodgood

A human's winning chance against current strong engines is basically 0.0%. There are possible drawing lines, but if the engine is configured correctly, the draw chance is also 0.0%.
As another commenter calculated, Stockfish wins in roughly 98% or more of cases. The remaining 2% or less aren't Magnus winning, but rather draws. I don't think any GM is known to have beaten a modern competitive AI in chess; however, there are recorded instances of draws.
Machine and human Elo ratings are decorrelated. No games are played between a full-strength engine and a human anymore. Even Magnus can't win against a top engine. He might draw if he is lucky.
Does Stockfish compete in FIDE competitions?