by noir_lord 2731 days ago
That was the first paper.

In the second paper they played fair, even giving Stockfish a 10-to-1 advantage on the same hardware that TCEC uses.


You can still run the games past the exact commit of Stockfish they used and it finds blunders in its own play, so it still feels like there's a lack of transparency. But I don't think anyone strongly believes AlphaZero isn't the best at this point.
Like which move? I wonder why the Stockfish developers haven't claimed any of this.
You can check out the exact commit of Stockfish from the paper and perform the analysis of the published games yourself. I doubt the Stockfish developers have bothered because it's an old version.

Can I just state again, because of your aggressive tone in multiple comments now, that I do believe AlphaZero is stronger, I don't believe there are real shenanigans going on, but it's _still_ sad that we can't reliably, publicly verify this stuff.

You are the one making this claim: "You can still run the games past the exact commit of Stockfish they used and it finds blunders in its own play, so it still feels like there's a lack of transparency."

This claim implies foul play, and I asked for a shred of evidence. Call me aggressive if you want; I just can't stand this kind of bullshit.

It's not bullshit, you can download the version of Stockfish and ask it to analyse the games! I did this with scid, you can too. Around move 27 in game one is one example. I don't intend to repeat the tedious process because my curiosity is satisfied, and you're just being obnoxious.
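For anyone wanting to repeat that kind of check, here is a minimal sketch of the blunder-flagging step only. It assumes you have already exported one evaluation per ply (in centipawns, from White's point of view) from whatever engine and GUI you used; the function name `flag_blunders`, the 200-centipawn threshold, and the sample numbers are all made up for illustration and do not come from the paper or from any real game.

```python
from typing import List, Tuple


def flag_blunders(evals_cp: List[int], threshold_cp: int = 200) -> List[Tuple[int, int]]:
    """Return (ply, loss_in_centipawns) for every move that lost at least threshold_cp.

    evals_cp[0] is the evaluation of the starting position; evals_cp[i] is the
    evaluation after ply i. All values are from White's point of view.
    The threshold is an arbitrary assumption, not a standard definition.
    """
    blunders = []
    for ply in range(1, len(evals_cp)):
        delta = evals_cp[ply] - evals_cp[ply - 1]
        white_moved = ply % 2 == 1  # odd plies are White's moves
        # A drop hurts White; a rise hurts Black.
        loss = -delta if white_moved else delta
        if loss >= threshold_cp:
            blunders.append((ply, loss))
    return blunders


if __name__ == "__main__":
    # Hypothetical evaluations, not taken from any published game.
    sample = [20, 25, 30, -250, -240, 100]
    print(flag_blunders(sample))
```

Note that this only compares the engine's evaluations of consecutive positions. It says nothing about whether the analysis ran under the same hardware, time control, or settings as the original match, so a flagged move is a prompt for a closer look rather than proof of anything.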
Did you really emulate all the conditions published in the page? I seriously doubt it. Otherwise all your claims are speculation and moot.
(And I know in the past, before reading the paper, I've assumed they were cheating this time because they made so little effort the first time around, but I'm happy to admit I was wrong).
glinscott said "We need a public exhibition match to settle the score, ideally with some GM commentary." above.

As for the GM side, Nakamura basically said the same.

Still, zero proof of foul play. But sure, I would very much enjoy watching AlphaZero destroy Stockfish in a public match.
Yeah, I'm certainly not claiming AlphaZero isn't the stronger engine (probably by quite a substantial margin). The first match _was_ a pretty clear fix, but the second paper is an entirely reasonable setup, so I have no real suspicions about the process on an ongoing basis. It's just a shame nobody can prove it for themselves.
Paper is patient. We won't know until AlphaZero participates in a fair tournament, like Leela, which recently came in third after Stockfish.