by gwern 911 days ago
Standard AlphaZero doesn't handle that. For that, you'd need to graduate to a more complex variant, like the aforementioned ReBeL, or AlphaZe* https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10213697/ or BetaZero https://arxiv.org/abs/2306.00249 or ExIt-OOS https://arxiv.org/abs/1808.10120 or Player of Games https://arxiv.org/abs/2112.03178#deepmind .

(You could also move straight to MuZero variations: https://arxiv.org/abs/2106.04615#deepmind https://openreview.net/forum?id=X6D9bAHhBQ1#deepmind https://openreview.net/forum?id=QnzSSoqmAvB )