

by Vetch 1654 days ago

This sort of evasiveness about method limitations, downplaying or de-emphasizing related work while boosting the senior authors' previous work, is standard academic fare. It's partly a strategy against novelty nitpickers, and it results in a net negative for all.

I also suspect part of the reason they chose Stockfish 8 was as a basis of comparison with AlphaZero. Their baselines for Go and poker are also pretty weak, so their emphasis is clearly on displaying generality and reduced domain-specialized input, not supremacy.

A single algorithm that plays both perfect and imperfect information games is difficult to achieve. Standard depth-limited solvers and self-play RL result in highly exploitable agents in imperfect information games. PoG appears to be very strong at chess, decently strong at Go and decent at poker (Facebook AI's ReBeL, the strongest prior work in this area, performed better against Slumbot). What's unique about PoG is its ability to also play an imperfect information game (Scotland Yard) that has many rounds and a relatively long horizon (although it still has scaling issues).