by ebalit 3256 days ago
In AlphaGo, MCTS is used to explore many plans and select the best one. As far as I know, it then executes only the first action of the selected plan and starts planning again from scratch for the next action. As such, it doesn't "stick to the plan", so you could say that it doesn't have a strategy. But MCTS is definitely a planner.
1 comment

Yes, absolutely. I think your explanation is perfectly correct.

Though (IMHO) MCTS is better characterised as evaluating moves rather than exploring plans.

The MCTS only explores the moves in order of likelihood, using the most basic of heuristics: random playouts.

The net outputs likely moves based only on the current board position; it formulates no strategy.

No state is stored across moves - each play is independent, relying only on the current board position.
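
To make that concrete, here is a minimal sketch of stateless move evaluation. It is not AlphaGo's code and not full MCTS: there is no tree, no UCB selection and no network, just flat Monte Carlo with uniformly random playouts on a toy single-pile Nim (take 1 to 3 stones, whoever takes the last stone wins). The game and every name here (`legal_moves`, `random_playout`, `choose_move`, `play_game`) are assumptions made up for illustration.

```python
import random


def legal_moves(stones):
    """Toy rule (an assumption for this sketch): take 1, 2 or 3 stones."""
    return [n for n in (1, 2, 3) if n <= stones]


def random_playout(stones, to_move, rng):
    """Play uniformly random moves to the end; return the winner (0 or 1)."""
    while True:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return to_move  # whoever took the last stone wins
        to_move = 1 - to_move


def choose_move(stones, player, rng, playouts=200):
    """Score each legal move by random playouts from the current position only."""
    best_move, best_rate = None, -1.0
    for move in legal_moves(stones):
        remaining = stones - move
        if remaining == 0:
            return move  # taking the last stone wins immediately
        wins = sum(
            random_playout(remaining, 1 - player, rng) == player
            for _ in range(playouts)
        )
        rate = wins / playouts
        if rate > best_rate:
            best_move, best_rate = move, rate
    return best_move


def play_game(stones=10, seed=0):
    """Self-play loop: evaluate from scratch on every turn, play one move, repeat."""
    rng = random.Random(seed)
    player, history = 0, []
    while True:
        move = choose_move(stones, player, rng)  # nothing carried over from earlier turns
        history.append((player, move))
        stones -= move
        if stones == 0:
            return player, history
        player = 1 - player
```

`play_game(10)` returns the winner and the list of `(player, move)` pairs. Note that `choose_move` receives only the current pile and whose turn it is; nothing computed on an earlier turn is reused.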

I still don't see anything anywhere in AlphaGo that is a plan, trajectory or strategy.

Neither is there an evaluation of the opponent nor any attempt to outwit them.

That it performs so astonishingly well without a plan is very interesting and should perhaps give us pause: is planning hubris? Do we undervalue our use of heuristics in our own behaviour?