| HN Mirror



	by fallingfrog 3755 days ago
	I think that's what the AlphaGo team did: they trained their agent against itself, and it learned moves that were never explicitly programmed in! The evaluation function just said ahead / not ahead.
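To make the self-play idea concrete, here is a toy sketch, not AlphaGo's actual method (which used neural networks and tree search). It assumes a much simpler game picked only for illustration: a pile of stones, each move removes 1 to 3, and whoever takes the last stone wins. One lookup table plays both sides, and the only feedback is "the mover just won". The names `train`, `best_move`, and `legal_moves`, and all the constants, are made up for this example.

```python
import random

MAX_TAKE = 3    # a move removes 1..3 stones; taking the last stone wins
MAX_PILE = 12   # largest pile the table covers (arbitrary choice)


def legal_moves(pile):
    """Moves available to the player facing `pile` stones."""
    return range(1, min(MAX_TAKE, pile) + 1)


def best_move(q, pile):
    """Greedy move: the one with the highest estimated chance of winning."""
    return max(legal_moves(pile), key=lambda move: q[(pile, move)])


def train(episodes=20000, alpha=0.5, epsilon=0.3, seed=0):
    """Tabular self-play: the same table chooses moves for both players."""
    rng = random.Random(seed)
    # q[(pile, move)] estimates how likely the mover is to win; 0.5 = no idea yet.
    q = {(n, a): 0.5 for n in range(1, MAX_PILE + 1) for a in legal_moves(n)}
    for _ in range(episodes):
        pile = rng.randint(1, MAX_PILE)  # random start so every pile gets visited
        while pile > 0:
            if rng.random() < epsilon:
                move = rng.choice(list(legal_moves(pile)))  # explore
            else:
                move = best_move(q, pile)                   # exploit
            rest = pile - move
            if rest == 0:
                target = 1.0  # mover took the last stone: "ahead"
            else:
                # The opponent is the same table, so the mover's chance is
                # one minus the opponent's best estimated chance.
                target = 1.0 - q[(rest, best_move(q, rest))]
            q[(pile, move)] += alpha * (target - q[(pile, move)])
            pile = rest
    return q


q = train()
for pile in range(1, MAX_PILE + 1):
    print(pile, "stones -> take", best_move(q, pile))
```

Nothing in the code states the known winning strategy for this game (leave your opponent a multiple of 4 stones), yet the table should settle on it from the win signal alone. For piles that are already a multiple of 4, every move loses against good play, so the printed move there is arbitrary.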