HN Mirror


	by aprescott 3793 days ago
	It's worth remembering that AlphaGo was initially trained, in part, on the game records of professional players, which formed the basis of its policy network and so on. AlphaGo's victory over Lee Sedol is impressive, but it does rest on the combined experience and long study of humans. That's why it's even more exciting to me to hear that one of the DeepMind team's next efforts will be to see how AlphaGo plays when it starts learning "from scratch".

3 comments

keehun 3793 days ago

I agree with the excitement. A possible outcome is that the AlphaGo trained "from scratch" will not be as strong, but another possibility is that, by eschewing any bias toward human tradition, it will come up with even better strategies that no one has thought of before.

nickpsecurity 3793 days ago

This was my argument against the StarCraft claim. If it wins, it will likely have just imitated human pros' thousands to millions of games like a smarter version of old chatterbots.

Whereas good human players learned as they went with far fewer matches, through trial and error, planning, and sometimes learning from pro games. We can even go from Age of Empires to StarCraft with little preparation and still do OK.

seanwilson 3793 days ago

Can anyone elaborate on how they would teach it from scratch? Would this mean only giving it the ability to play valid moves (including against itself) and giving it access to the final score?

eru 3793 days ago

Basically. They'd start with random moves.

That worked with backgammon without any problems. (http://www.bkgm.com/articles/tesauro/tdl.html)

In Go, a naive approach would probably also work, but would take ages. There's probably a slightly smarter approach possible.
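
To make the "start with random moves" idea concrete, here is a rough sketch of self-play learning where the program only knows the legal moves and who won at the end. It is a toy, not what TD-Gammon or DeepMind actually do: the game is single-pile Nim (take 1 to 3 stones, whoever takes the last stone wins), the value function is a plain table updated with one-step temporal-difference learning rather than a neural network, and the names `train`, `best_move`, and all parameter values are made up for illustration.

```python
import random


def legal_moves(stones):
    """The only game knowledge given to the learner: which moves are valid."""
    return [k for k in (1, 2, 3) if k <= stones]


def best_move(V, stones):
    """Greedy choice: leave the opponent in the position that looks worst for them."""
    return min(legal_moves(stones), key=lambda k: V[stones - k])


def train(pile=12, episodes=20000, alpha=0.05, epsilon=0.1, seed=0):
    """Tabular TD(0) self-play. All hyperparameters are illustrative assumptions."""
    rng = random.Random(seed)
    # V[n]: estimated chance that the player to move wins with n stones left.
    # Everything starts at 0.5, so early play is effectively arbitrary.
    V = [0.5] * (pile + 1)
    V[0] = 0.0  # no stones left: the previous player took the last one and won
    for _ in range(episodes):
        stones = pile
        while stones > 0:
            if rng.random() < epsilon:
                move = rng.choice(legal_moves(stones))  # explore
            else:
                move = best_move(V, stones)             # exploit current estimates
            nxt = stones - move
            # The next position is seen from the opponent's side, so our
            # winning chance is one minus theirs. The final result enters
            # only through V[0], which stays fixed at 0.
            target = 1.0 - V[nxt]
            V[stones] += alpha * (target - V[stones])
            stones = nxt
    return V


if __name__ == "__main__":
    V = train()
    for n in range(1, len(V)):
        print(n, round(V[n], 2), "take", best_move(V, n))
```

Both sides share the same table, so every game the program plays against itself improves both "players" at once. The final result is the only training signal, and it has to propagate backwards one position at a time, which gives a feel for why the naive version would take ages on a game the size of Go.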
