| HN Mirror

	by gjstein 2066 days ago
	The "search" here refers to the idea that, in principle, you could search the entire space of possible hands of cards and work out the optimal action exhaustively by considering every hand. As in Go, however, that is computationally intractable, so instead they use machine learning to guide the search towards more promising moves. In AlphaGo (and here), this learning happens as part of a reinforcement learning pipeline. There was some discussion of this in the AlphaGo Zero blog post from a while back: https://deepmind.com/blog/article/alphago-zero-starting-scra...
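	To make the "guide search" idea concrete, here is a minimal sketch on a toy game rather than cards or Go. The game (take 1 to 3 stones from a pile, whoever takes the last stone wins), the names `search` and `toy_policy`, and the `top_k` cutoff are all assumptions for illustration. The hand-written `toy_policy` only stands in for a learned model, and AlphaGo itself uses Monte Carlo tree search with policy and value networks rather than this simple top-k pruning.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical type: scores a (pile, move) pair; higher means "more promising".
Policy = Callable[[int, int], float]


def legal_moves(pile: int) -> List[int]:
    """Toy game: a player may take 1, 2 or 3 stones from the pile."""
    return [m for m in (1, 2, 3) if m <= pile]


def search(pile: int, policy: Optional[Policy] = None, top_k: int = 3) -> Tuple[int, int]:
    """Negamax search. Returns (value, nodes_visited).

    value is +1 if the player to move can force a win, -1 otherwise.
    With a policy, only the top_k highest-scoring moves are expanded.
    """
    if pile == 0:
        # The previous player took the last stone, so the player to move lost.
        return -1, 1
    candidates = legal_moves(pile)
    if policy is not None:
        ranked = sorted(candidates, key=lambda m: policy(pile, m), reverse=True)
        candidates = ranked[:top_k]
    best, nodes = -1, 1
    for move in candidates:
        child_value, child_nodes = search(pile - move, policy, top_k)
        best = max(best, -child_value)  # the opponent's loss is our win
        nodes += child_nodes
    return best, nodes


def toy_policy(pile: int, move: int) -> float:
    """Stand-in for a learned policy (an assumption, not a trained model).

    It prefers moves that leave the opponent a multiple of 4 stones.
    """
    return 1.0 if (pile - move) % 4 == 0 else 0.0


full_value, full_nodes = search(10)                          # exhaustive
guided_value, guided_nodes = search(10, toy_policy, top_k=1)  # guided
print(full_value, full_nodes, guided_value, guided_nodes)
```

	Both calls agree on who wins from a pile of 10, but the guided one visits far fewer nodes. The catch is that pruning is only as good as the policy: a poor one can discard the winning move, which is roughly why the policy is trained rather than hand-written in the real systems.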