HN Mirror


by tromp 2190 days ago

The implementation includes Connect Four as an example application. While the standard board size of 7x6 is indeed solved, as they note, and in fact all sizes up to 8x8 are [1], they could have picked 9x8 or 9x9, which are currently unsolved. The latter is the new standard size on Little Golem [2], which upgraded from 8x8 when that was solved [3].

[1] https://tromp.github.io/c4/c4.html

[2] http://www.littlegolem.net/jsp/games/gamedetail.jsp?gtid=fir

[3] http://www.littlegolem.net/jsp/forum/topic2.jsp?forum=80&top...


jonath_laurent 2190 days ago

I completely agree with you. Let me just add two remarks. First, although picking 9x9 boards does indeed make Connect Four intractable for brute-force search, I would be surprised if it made the game much more difficult for AlphaZero, which relies on the generalization capabilities of the network anyway. Second, using a solved game for the tutorial is a feature, not a bug: since the ground truth is known, the resulting agent can be benchmarked precisely.


dnautics 2190 days ago

That's really cool and I didn't think of that. I just want to clarify: does that mean you train the agent without the deterministic solution, and that your "validation/test" sets (I'm not sure what those phases are called in unsupervised learning) are also done without the deterministic solution?


jonath_laurent 2190 days ago

Yes, the agent is trained without access to the deterministic solution.


tromp 2190 days ago

I did not see an evaluation of how close to perfection the agent becomes. Did you compute any sort of error rate (by finding moves that turn a won position into a non-won one or a drawn position into a lost one)? And how does this error rate drop over time as learning advances? That would indeed be very interesting to see.
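
The error rate described above can be sketched as follows. This is a minimal illustration in Python, not code from AlphaZero.jl: the names `is_error` and `error_rate` are hypothetical, and it assumes a perfect solver has already labeled each position with its game-theoretic value from the mover's perspective, both before and after the agent's chosen move. Skipping lost positions in the denominator is also an assumption, made because no move can make a lost position worse.

```python
from typing import Iterable, Tuple

# Game-theoretic values, always from the perspective of the player who moves.
WIN, DRAW, LOSS = 1, 0, -1


def is_error(value_before: int, value_after: int) -> bool:
    """A move is an error if it lowers the mover's game-theoretic value,
    i.e. turns a win into a non-win or a draw into a loss."""
    return value_after < value_before


def error_rate(samples: Iterable[Tuple[int, int]]) -> float:
    """Fraction of erroneous moves among positions where an error is possible.

    Each sample is (value_before, value_after) for one move played by the
    agent. Lost positions are skipped (assumption: they cannot be worsened).
    """
    errors = 0
    total = 0
    for before, after in samples:
        if before == LOSS:
            continue
        total += 1
        if is_error(before, after):
            errors += 1
    return errors / total if total else 0.0


# Hypothetical sample: two sound moves, two errors, one lost position.
samples = [(WIN, WIN), (WIN, DRAW), (DRAW, LOSS), (DRAW, DRAW), (LOSS, LOSS)]
rate = error_rate(samples)
```

Computing this rate on a fixed set of positions after each training iteration would give the curve of errors over time asked about above.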


vishvananda 2190 days ago

My team did an implementation of AlphaZero for Connect Four a couple of years ago. Our findings are in a series of blog posts starting at https://medium.com/oracledevs/lessons-from-implementing-alph.... We didn't manage to reach perfection on policy either, but got pretty close. You can play against some versions of the network here: https://azfour.com


jonath_laurent 2190 days ago

Your series of blog articles has been an important source of inspiration in writing AlphaZero.jl and I cite it frequently in the documentation. Thanks to you and your team!


jonath_laurent 2190 days ago

Such an evaluation is available in the tutorial: https://jonathan-laurent.github.io/AlphaZero.jl/dev/tutorial...

Admittedly, the Connect Four agent is still far from perfect, but there is a lot of room for improvement as I have done very little hyperparameter tuning so far.
