Superhuman Stratego with RL and test-time search

	Superhuman Stratego with RL and test-time search (arxiv.org)
	2 points by algo_trader 218 days ago

1 comment

algo_trader 218 days ago

Only 2000 GPU hours. Heavily customized network. 95% win rate in recent human tournament sample. Several training techniques for evaluation/learning rate.
