by glinscott 2731 days ago
James put together a really nice summary of the ideas and the projects!

lc0 was launched almost a year ago, and since then the community (led by Alexander Lyashuk, author of the current engine) has taken it to a totally different level. Follow along at http://lczero.org!

Gcp has also done an amazing job with Leela Zero, with a very active community on the Go side. http://zero.sjeng.org

Of course, DeepMind really did something amazing with AlphaZero. It’s hard to overstate how dominant minimax search has been in chess. For another approach (MCTS/NN) to even be competitive with 50+ years of research is amazing. And all that without any human knowledge!

Still, Stockfish keeps on improving - Stockfish 10 is significantly stronger than the version AlphaZero played in the paper (no fault of DeepMind; SF just improves quickly). We need a public exhibition match to settle the score, ideally with some GM commentary :). To complete the links you can watch Stockfish improve here: http://tests.stockfishchess.org.

2 comments

I thought alphazero was based on minimax and used neural networks to evaluate moves. Isn't this the case?
MCTS is not a traditional depth-first minimax framework. Key concepts like alpha-beta don’t apply. Although it is proven to converge to minimax in the limit, the game trees are so large that this is not relevant. You could use the network in a minimax searcher, but it’s so much slower than a conventional evaluation function that it’s unlikely to be competitive.
Very roughly you can think of AlphaZero as a best-first tree search, where 'best' is some statistical estimate.
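
To make "best-first, where 'best' is a statistical estimate" a bit more concrete, here is a minimal sketch of the kind of selection rule usually described for AlphaZero-style search: pick the child with the highest mean value plus an exploration bonus scaled by the network's prior. The names `Child`, `puct_score`, `select_child` and the constant `c_puct = 1.5` are illustrative assumptions, not taken from AlphaZero or lc0 source code.

```python
import math
from dataclasses import dataclass


@dataclass
class Child:
    """Statistics for one candidate move at a search node (hypothetical layout)."""
    move: str
    prior: float             # policy-network probability for this move (assumed given)
    visits: int = 0          # number of simulations that went through this move
    value_sum: float = 0.0   # sum of evaluations backed up through this move

    @property
    def mean_value(self) -> float:
        # Average backed-up value; 0.0 for a move that was never visited.
        return self.value_sum / self.visits if self.visits else 0.0


def puct_score(child, parent_visits, c_puct=1.5):
    # Exploration bonus: large for high-prior, rarely visited moves,
    # shrinking as the move accumulates visits.
    exploration = c_puct * child.prior * math.sqrt(parent_visits) / (1 + child.visits)
    return child.mean_value + exploration


def select_child(children, c_puct=1.5):
    # Best-first step: descend into the child with the highest score.
    # The parent visit count is approximated by the sum of child visits (at least 1).
    parent_visits = max(1, sum(c.visits for c in children))
    return max(children, key=lambda c: puct_score(c, parent_visits, c_puct))
```

For example, with e4 (prior 0.6, 10 visits, value sum 5.0), d4 (prior 0.3, unvisited) and a3 (prior 0.1, unvisited), `select_child` returns d4: its exploration bonus outweighs e4's decent average. There is no alpha-beta window and no fixed depth; the tree simply grows where this score says it is most worthwhile.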
It is kind of the case, but it does not need to expand the whole node to find the maximum. Instead, it samples some children using a NN (the Monte Carlo aspect).
I strongly suspect alphazero is easily beatable, once you have your hands on it. This is just from experience: most neural-network-style systems are weak against adversarial opponents who understand their internals.

Of course I can't be sure, because Google refuses to give anyone access to alphazero, or a network trained with it. Personally, that gives me more confidence they know there are significant exploitable weaknesses.

No need to wait for AlphaZero, you can try Leela Chess Zero today. From my experience the network without search has some blind spots, but the tree search is pretty effective in fixing them.
Adversarial? If the model exclusively trains against itself, you can’t really insert anything there. Do you mean, play confusing moves at the beginning of the game?
I mean, if we had the network, it would be easy to beat, the same way you can confuse image recognition systems with very minor changes.
The way adversarial examples trick image recognition systems is by varying the pixels of the input image slightly to manipulate the output of the neural network.
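
As a toy illustration of that pixel-nudging idea, the sketch below applies a sign-of-gradient step (in the spirit of the fast gradient sign method) to a tiny linear classifier. Everything here is an assumption for illustration: the four-"pixel" input, the weights, and the helper names `score` and `perturb`. A real image network is nonlinear and would need a framework to compute the gradient.

```python
def score(w, b, x):
    # Linear classifier: positive score means class +1, negative means class -1.
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def sign(v):
    # Returns 1, 0 or -1.
    return (v > 0) - (v < 0)


def perturb(w, x, eps, label):
    # For a linear model the gradient of the score with respect to x is just w.
    # Move every pixel by at most eps in the direction that hurts `label`,
    # then clip back to the valid pixel range [0, 1].
    return [min(1.0, max(0.0, xi - label * eps * sign(wi))) for xi, wi in zip(x, w)]


# Hypothetical weights and a mid-grey 4-pixel "image".
w = [0.5, -0.4, 0.3, -0.2]
b = -0.05
x = [0.5, 0.5, 0.5, 0.5]

label = 1 if score(w, b, x) > 0 else -1
x_adv = perturb(w, x, eps=0.05, label=label)
flipped = (score(w, b, x_adv) > 0) != (score(w, b, x) > 0)
```

With these made-up numbers no pixel moves by more than 0.05, yet the sign of the score changes, so `flipped` is true. The attack relies on being free to change every input value by a small continuous amount.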

For alphazero, the input is the board, which you can't manipulate arbitrarily. You can run an evaluation of a board based on a move and see if it's significantly different than the evaluation that alphazero comes up with, and maybe try to exploit that. But if you have a better evaluation of some state than that of alphazero, you're likely a stronger player anyway so this extra step is unnecessary. Most of the value of the bot comes from the evaluation function of a board, along with some hyper-parameters. But the evaluation is probably the most important part and the most difficult to replicate.

That doesn't follow. For you to confuse it, you need to change the inputs. For images, this is fine, we can smoothly change lots of little things. For chess games or go you don't have that freedom.

You can download the weights for LCZero right now though and try out your theory. https://github.com/LeelaChessZero/lc0/wiki/Getting-Started

You are right, I should try. I'll see if I can find time in the new year.

I'd prefer to try with a go player, because as you say, in chess it's hard to exactly control the input to the network, it's easier in Go.

Here's a go setup https://github.com/gcp/leela-zero

The current best weights are available. Not alphazero, but I would expect that issues would be general, so if there are issues with leela zero they may transfer, and if you don't see issues with leela zero they're unlikely to exist in alpha zero (at least, if they do they may be very particular to subtle training differences).

Would be very interested to see what you find if you get the chance.

You can change the inputs: it depends on when (ply) and which move you play. Some moves are uncommon enough to make it possible for you to uncover something?
You absolutely can change the inputs, but the point I wanted to make is that unlike images, where you can make human-irrelevant changes, you can't really do that with chess or go.

If you want to construct a particular position on the board, you'd likely need to use multiple steps, require the AI to play very particular moves and then the outcome would be a certain move from the AI. Even then, a simple incorrect classification doesn't help all that much, you need your opponent to make repeated mistakes.

I think in reality if you uncovered a type of move it wasn't expecting you are likely to uncover a new strategy in general rather than a trick. Image classification however lets you play uninterrupted with tiny pixel value changes, and you only need a single incorrect output to "win".

I suspect it's a bit harder for the network to be overfit like this, but it's probably possible it has some gaps in its evaluation. However, those gaps would have to persist beyond its search horizon and not concretely affect material or mobility, and it just seems vanishingly unlikely you'll find any systematic way to exploit anything.
I guess if you understand the internals of a NN you can just write a paper and publish it.

Generating the right noise was proven to be successful against NNs (https://blog.openai.com/adversarial-example-research/) but I am not sure how you could apply that to this context.