


	by rytill 3041 days ago
	The AlphaZero algorithm (Monte Carlo tree search with a value estimator trained by reinforcement learning) works on any environment you can simulate at play time, single-player or not.

1 comments

eutectic 3041 days ago

Any environment with finite action and state-spaces.


rytill 3030 days ago

No, the key requirement that makes it difficult to use on real-world tasks is that you must be able to do a forward rollout of your environment as part of your decision-making process.
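
To make the rollout requirement concrete, here is a rough sketch of planning by forward simulation. It is not AlphaZero: it uses plain random rollouts instead of tree search and has no learned value estimator. The toy environment (count up to exactly 10 by adding 1 or 2) and the names `step`, `rollout`, and `plan` are all made up for illustration. The point is only that the planner has to call `step` on states it has not actually visited, which is what you usually cannot do in a real-world task.

```python
import random

TARGET = 10        # hypothetical toy task: reach exactly 10
ACTIONS = (1, 2)   # each move adds 1 or 2 to the running total


def step(state, action):
    """Simulator for the toy task: returns (next_state, reward, done)."""
    nxt = state + action
    if nxt == TARGET:
        return nxt, 1.0, True    # hit the target exactly: win
    if nxt > TARGET:
        return nxt, 0.0, True    # overshot: lose
    return nxt, 0.0, False       # episode continues


def rollout(state, rng):
    """Play uniformly random actions in the simulator until the episode ends."""
    reward, done = 0.0, False
    while not done:
        state, reward, done = step(state, rng.choice(ACTIONS))
    return reward


def plan(state, n_rollouts=200, seed=0):
    """Pick the action whose simulated outcomes have the best average reward."""
    rng = random.Random(seed)
    scores = {}
    for action in ACTIONS:
        nxt, reward, done = step(state, action)  # forward simulation, not a real move
        if done:
            scores[action] = reward
        else:
            total = sum(rollout(nxt, rng) for _ in range(n_rollouts))
            scores[action] = total / n_rollouts
    return max(scores, key=scores.get)


print(plan(8))  # 2: adding 2 reaches the target immediately
```

Every call to `plan` runs hundreds of simulated `step` calls before committing to one real action. If `step` were a robot arm or a live market, none of those calls could be made without actually acting.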
