by 1024core 3758 days ago
What if the "BetaGo" played only against AlphaGo, and learned from its games?

BTW: even humans don't just randomly pick up the game. They have teachers, who teach them the tricks of the trade and monitor their games.

1 comment

That's already a known method for transferring "knowledge" from one model to another. I should double-check before citing a paper, but I think this one covers it (http://arxiv.org/abs/1503.02531).

You train many models. Then you "distill" them into one model: the predictions from the many models become the targets for a single model trained afterwards.
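To make that concrete, here is a rough sketch of the idea in plain Python, not the paper's actual setup. The teacher logits are made-up numbers, the "student" is just a vector of logits for a single input rather than a real network, and the temperature, learning rate, and function names are all my own assumptions for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax over logits, softened by a temperature (assumed knob)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def soft_targets(teacher_logits, temperature=2.0):
    """Average the softened predictions of several teachers for one input."""
    probs = [softmax(logits, temperature) for logits in teacher_logits]
    n = len(probs)
    return [sum(p[k] for p in probs) / n for k in range(len(probs[0]))]

def distill_step(student_logits, targets, lr=1.0, temperature=2.0):
    """One gradient step on cross-entropy between targets and student output."""
    q = softmax(student_logits, temperature)
    # Gradient of CE(targets, softmax(z / T)) w.r.t. z_k is (q_k - p_k) / T.
    return [z - lr * (qk - pk) / temperature
            for z, qk, pk in zip(student_logits, q, targets)]

# Hypothetical logits from three teachers for one input, three classes.
teachers = [[2.0, 1.0, 0.1], [1.8, 1.2, 0.0], [2.2, 0.7, 0.3]]
targets = soft_targets(teachers)

student = [0.0, 0.0, 0.0]  # toy student: raw logits for this one input
for _ in range(500):
    student = distill_step(student, targets)

student_probs = softmax(student, 2.0)
```

After the loop, `student_probs` should sit very close to `targets`: the single model has absorbed the averaged opinion of the ensemble, including how much weight the teachers gave to the "wrong" classes.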

You're right to point out that humans don't do that.

I think it would be "cheating" to train BetaGo on AlphaGo, for the purposes of that experiment. The goal would be to have some kind of "clean room" where people fumble around.

Of course, you can also run the other experiment to see how fast you can bootstrap BetaGo from AlphaGo. That's also interesting.