by Matetricks 3472 days ago
If I understand your code correctly in analyse_evaluations, you're defining a "surprising move" as a move that has a large change in valuation when it's considered at a higher depth. So if a "human" (really Stockfish at depth 5) evaluates a move as +1 and a "computer" (Stockfish now at depth 11) evaluates the move as +5, the move is surprising.
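To make that reading concrete, here is a minimal sketch of the definition as I understand it. It is not the project's actual code: the function names and the 2.0 pawn threshold are assumptions of mine, and the evaluations are passed in as plain numbers rather than pulled from Stockfish. It flags a swing in either direction, since "large change" doesn't say which way.

```python
# Evaluations are in pawns from the mover's point of view. In practice they
# would come from an engine run at two depths; here they are plain numbers.

SURPRISE_THRESHOLD = 2.0  # hypothetical cutoff, not taken from the project


def evaluation_swing(shallow_eval: float, deep_eval: float) -> float:
    """How much the evaluation changes when the search goes deeper."""
    return deep_eval - shallow_eval


def is_surprising(shallow_eval: float, deep_eval: float,
                  threshold: float = SURPRISE_THRESHOLD) -> bool:
    """Flag a move when the swing between depths is at least `threshold` in size."""
    return abs(evaluation_swing(shallow_eval, deep_eval)) >= threshold


print(is_surprising(1.0, 5.0))  # the +1 (depth 5) vs +5 (depth 11) example
print(is_surprising(1.0, 1.5))  # a small swing
```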

This is pretty interesting, but I'm not sure if it fully captures all the nuances of what a surprising move is. You might be able to classify a move as tactically surprising if it becomes clear after depth 7 that the ending position is favorable. However, in my opinion truly surprising moves are ones that carry plans that I haven't even considered. Hence, this methodology doesn't capture moves that are positionally surprising as there wouldn't be such a drastic change in evaluation at different depths. I'm not sure where you would start to figure that one out though :)

That being said this is really cool work!


You can filter for this:

Take a database of games from pro players. Keep the set of all moves where Stockfish at depth 5 agrees that the move actually played is the optimum. Then filter for the moves where Stockfish at depth 11 has a different opinion that results in a big gain in position. What you get is a list of moves that would surprise pro players under time pressure.
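Roughly, the filter could look like the sketch below. Everything here is hypothetical: the `AnalysedMove` record, its field names and the 1.5 pawn `min_gain` are mine, and the engine results are assumed to have been computed already for each position.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AnalysedMove:
    played: str              # move actually played in the game
    shallow_best: str        # engine's preferred move at depth 5
    deep_best: str           # engine's preferred move at depth 11
    deep_eval_played: float  # depth-11 evaluation of the played move, in pawns
    deep_eval_best: float    # depth-11 evaluation of the depth-11 choice, in pawns


def missed_surprises(moves: List[AnalysedMove],
                     min_gain: float = 1.5) -> List[AnalysedMove]:
    """Keep positions where depth 5 agrees with the player but depth 11 finds
    a different move that is better by at least `min_gain` pawns."""
    result = []
    for m in moves:
        shallow_agrees = m.shallow_best == m.played
        deep_disagrees = m.deep_best != m.played
        gain = m.deep_eval_best - m.deep_eval_played
        if shallow_agrees and deep_disagrees and gain >= min_gain:
            result.append(m)
    return result


# Made-up records, only to show the shape of the data.
sample = [
    AnalysedMove("Nf3", "Nf3", "Bxh7+", 0.3, 2.4),  # kept: deeper search finds a big gain
    AnalysedMove("O-O", "O-O", "O-O", 0.2, 0.2),    # dropped: both depths agree
    AnalysedMove("Qd2", "Rad1", "Rad1", 0.0, 2.0),  # dropped: depth 5 already disagreed
    AnalysedMove("a3", "a3", "h3", 0.1, 0.3),       # dropped: gain too small
]
for move in missed_surprises(sample):
    print(move.played, "->", move.deep_best)
```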

I wouldn't be surprised if professional chess players are all running a version of this against individual known opponents before a tournament to probe for weaknesses.

A harder problem would be to cross-reference this final list with the post-game opinions published by professional commentators and identify major discrepancies. This would be the "wouldn't have thought of it in a million years" list.

The list that's returned still contains mainly tactical surprises where Stockfish inaccurately evaluated the position at the end of depth 5. I think what I'm trying to say is there are some moves in a position that aren't tactically surprising (a piece sacrifice, a crazy attacking move, etc.) but positionally surprising (a long maneuver to get a piece to a certain square that I didn't think of). These positionally surprising moves aren't captured by this methodology because they don't involve large fluctuations in valuation when the depth changes.

As to your second point, one issue with how computer chess affects the modern scene is that playing the "best" move in any given position isn't representative of how humans play. Humans carry out plans and evaluate positions to the best of their ability, but the heuristics and procedures they use aren't the same as a computer's. For example, Karjakin didn't prepare for his match against Carlsen last month by playing a bunch of games against Stockfish. Rather, he probably analyzed Carlsen's past games and opening choices to come up with a strategy.

I do think you can come up with a way to prepare against individual known opponents by identifying weaknesses programmatically. You can model a human's approach to playing chess as a distribution of parameters (material, king safety, pawn structure, etc.) that take in the current position and return the best move. You also have Stockfish's evaluation, which returns the "best" move. With this, it's possible that you could build a neural network that learns to play very similarly to a certain player by using their past games as a training set and comparing the chosen move to Stockfish's move. The network could learn to mimic the heuristics that the individual uses to make decisions, and playing against this new AI would be great practice for preparing against specific opponents.
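As a toy version of the "parameters in, move out" idea, here is a sketch that uses a plain weighted sum instead of a neural network, which is a deliberate simplification on my part. The feature names, weights and numbers are invented for illustration; a real attempt would fit the weights to a player's past games rather than set them by hand.

```python
from typing import Dict, List, Tuple

Features = Dict[str, float]       # e.g. {"material": 1.0, "king_safety": -0.5}
Candidates = Dict[str, Features]  # move -> features of the resulting position


def score(weights: Features, features: Features) -> float:
    """Weighted sum of position features under one player's preferences."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def choose_move(weights: Features, candidates: Candidates) -> str:
    """The move this player model likes best among the candidates."""
    return max(candidates, key=lambda move: score(weights, candidates[move]))


def agreement_rate(weights: Features,
                   history: List[Tuple[Candidates, str]]) -> float:
    """Fraction of past positions where the model picks the move actually played."""
    if not history:
        return 0.0
    hits = sum(1 for candidates, played in history
               if choose_move(weights, candidates) == played)
    return hits / len(history)


# Hypothetical position: grab a pawn at some cost to king safety, or castle.
position = {
    "Nxe5": {"material": 1.0, "king_safety": -0.5},
    "O-O": {"material": 0.0, "king_safety": 1.0},
}
materialist = {"material": 1.0, "king_safety": 0.2}
cautious = {"material": 0.5, "king_safety": 1.0}

print(choose_move(materialist, position))
print(choose_move(cautious, position))
```

The agreement rate is one crude way to check whether a set of weights resembles a given player: the higher it is on their past games, the closer the mimic.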

I'm not sure I follow your point about tactical vs positional surprise. Surely the ultimate goal of the positional surprise is the same as the tactical surprise - you get an advantage at the end of an expected series of moves. Otherwise what's the point of getting into a surprising position that's not better than the conventional one?

My question is, is there any difference here that can't be solved by, say, upping the ply-number?

On humanlike chess-AI: have an adversarial network that works to classify human vs machine players, and optimize for humanness * strength-of-play in the AI?
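To spell out the objective I have in mind, here is a sketch. The scores are assumed to lie between 0 and 1, with humanness being something like the probability the classifier assigns to "human"; the candidate names and numbers are made up.

```python
from typing import Dict, Tuple


def combined_objective(humanness: float, strength: float) -> float:
    """Product objective: a candidate must do well on both axes to score high."""
    return humanness * strength


def best_candidate(candidates: Dict[str, Tuple[float, float]]) -> str:
    """Pick the candidate AI with the highest humanness * strength."""
    return max(candidates, key=lambda name: combined_objective(*candidates[name]))


# (humanness, strength) pairs for three hypothetical candidate AIs.
candidates = {
    "engine_like": (0.1, 0.95),
    "human_like_but_weak": (0.9, 0.3),
    "balanced": (0.7, 0.8),
}
print(best_candidate(candidates))
```

Multiplying rather than adding means a near-zero score on either axis sinks the candidate, so a very strong but obviously machine-like player can't win on strength alone.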

The difference is that the positional sacrifice is less tangible. A space advantage, a tempo advantage, more mobile pieces, improved cohesion/coordination of pieces (Kasparov was legendary for taking this last kind of advantage and turning it into a lethal attack). It's a dynamic advantage rather than a static/permanent advantage, which also means there's a risk of that advantage dissipating as the game drags on.

These advantages aren't the kind where you can sit back and let the game play out confident of winning. It's a deliberate unbalancing of the equilibrium of the position, and one where this temporary dynamic advantage needs to be used to create a longer-lasting and static advantage.

Would it be fair to say you are trying to optimize for future positions where you aren't sure you will win, but the positions resemble certain archetypal positions or share certain features that are advantageous (i.e. have a high probability of transforming into conventionally advantageous situations)?

I'm sure the chess AIs are full of this sort of knowledge internally, though, in the form of computation optimization algorithms. Perhaps the issue is to translate it to a human-usable format.

Indeed, chess engines do have heuristics to include positional advantage in their evaluation of a board, so they "know" in some way that a doubled pawn is disadvantageous or that development of pieces or attacking central squares is beneficial, much as humans know these things.

I've never heard experts discuss this, but I bet it's true that human beings still succeed in appreciating many of these benefits at a higher level of abstraction than machines do. An argument for this is that computers needed an extremely large advantage in explicit search depth to be able to beat human grandmasters. So the humans had other kinds of advantages going for them and most likely still do. One of those advantages that seems plausible is more sophisticated evaluation of why a position is strong or weak, without explicit game tree searches.

I looked at the Stockfish code very briefly during TCEC and it looks like a number of the evaluation heuristics that are not based on material (captures) are manually coded based on human reasoning about chess positions. But if I understood correctly, they are also running machine learning with huge numbers of simulated games in order to empirically reweight these heuristics, so if a particular heuristic turns out to help win games, it can be assessed as more valid/higher priority.

You could imagine that there are some things that human players know tacitly or explicitly that Stockfish or other engines still have no representation of at all, and they might contribute quite a bit to the humans' strength.

Perhaps the positional sacrifice can be identified by similar means. The most superficial measurement of a position is the material left on the board. So when you compare the superficial measurement to a deeper positional measurement and they are divergent, then we have something positional.
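A rough sketch of that comparison: count material straight from the FEN string and subtract it from whatever the engine says. The piece values are the usual textbook ones, the function names are mine, and the engine evaluation is passed in as a number (in pawns, from White's point of view) rather than computed here.

```python
# Conventional piece values in pawns; the king is not counted as material.
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9, "k": 0}


def material_balance(fen: str) -> int:
    """White material minus Black material, read from the FEN placement field."""
    placement = fen.split()[0]
    balance = 0
    for ch in placement:
        value = PIECE_VALUES.get(ch.lower())
        if value is None:
            continue  # digits (empty squares) and '/' rank separators
        balance += value if ch.isupper() else -value
    return balance


def positional_component(fen: str, engine_eval: float) -> float:
    """Part of the engine evaluation that material alone does not explain."""
    return engine_eval - material_balance(fen)


START = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
# Illustrative position: the starting position with White's e-pawn removed.
PAWN_DOWN = "rnbqkbnr/pppppppp/8/8/8/8/PPPP1PPP/RNBQKBNR w KQkq - 0 1"

print(material_balance(START))
# 0.5 is a made-up engine evaluation, only to show the divergence.
print(positional_component(PAWN_DOWN, 0.5))
```

A large positive result for the side that is behind on material is the signature of a positional sacrifice: the engine likes the position more than the piece count says it should.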

I think one of Kasparov's games against Karpov in the New York portion of one of their World Championship matches involved Kasparov sacrificing a queen for positional compensation on the black side of a King's Indian. It would be interesting to see what this project thinks of that game.

This is pretty much what I wanted to say.

What is a surprising move varies greatly from player to machine. Here's a good example:

http://www.chessgames.com/perl/chessgame?gid=1064780

Capa's move 10 here (Bd7) is completely surprising to the vast majority of players and computers. It breaks most of the standard 'rules' of development and space control. However, it doesn't move the needle in terms of tactical significance at all. To me, that's a surprising move.

Surprising may be a bit subjective. I know that I am all too often surprised by my opponent - and not in a good way. This may be a good way to study what kind of patterns have interesting weaknesses.

Surprising is very subjective. I was playing chess in a cabin full of people and found the checkers game next to me to be more entertaining - most likely because of the people. My chess moves were not quite random, but because I wasn't really paying attention they were really frustrating my opponent because he couldn't make sense of what I was doing. My moves were either not very logical or so brilliant that he didn't know what I was up to and it was really getting into his head. Surprising moves? Yes. Good moves? Not so much.