| HN Mirror

by HarHarVeryFunny 471 days ago

First there was AlphaGo, which had learnt from human games, then further improved from self-play, then there was AlphaGo Zero which taught itself from scratch just by self-play, not using any human data at all.

Game programs like AlphaGo and AlphaZero (chess) are all brute force at core - using MCTS (Monte Carlo Tree Search) to project all potential branching game continuations many moves ahead. Where the intelligence/heuristics come into play is in pruning away unpromising branches from this expanding tree to keep the search space under control; this is done by using a board evaluation function to assess the strength of a candidate board position and decide whether that line of play is worth exploring further.
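To make the search loop described above concrete, here is a minimal sketch of plain UCT-style MCTS on a toy game. Everything in it is an illustrative assumption rather than anything taken from AlphaGo or AlphaZero: the game is single-pile Nim (take 1 to 3 stones, whoever takes the last stone wins), the names `Node`, `uct_child`, `rollout` and `mcts` are made up for this example, and positions are evaluated by random playouts, where an AlphaZero-style engine would instead query a learnt network.

```python
import math
import random


class Node:
    """One position in the search tree of single-pile Nim (toy game, assumed)."""

    def __init__(self, pile, parent=None, move=None):
        self.pile = pile              # stones left for the player to move
        self.parent = parent
        self.move = move              # move that led from parent to this node
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= pile]
        self.visits = 0
        self.wins = 0.0               # wins for the player who moved INTO this node


def uct_child(node, c=1.4):
    """Pick the child with the best win rate plus exploration bonus."""
    return max(
        node.children,
        key=lambda ch: ch.wins / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )


def rollout(pile, rng):
    """Random playout; True if the player to move at `pile` ends up winning."""
    turn = 0
    while pile > 0:
        pile -= rng.randint(1, min(3, pile))
        if pile == 0:
            return turn == 0          # whoever took the last stone wins
        turn ^= 1
    return False                      # pile was already empty: mover has lost


def mcts(pile, iterations=500, seed=0):
    rng = random.Random(seed)
    root = Node(pile)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while not node.untried and node.children:
            node = uct_child(node)
        # 2. Expansion: add one new child, if any move is still untried.
        if node.untried:
            m = node.untried.pop()
            child = Node(node.pile - m, parent=node, move=m)
            node.children.append(child)
            node = child
        # 3. Evaluation: a random playout stands in for an evaluation function.
        to_move_wins = rollout(node.pile, rng)
        # 4. Backpropagation: flip the result at each level up the tree.
        result = 0.0 if to_move_wins else 1.0
        while node is not None:
            node.visits += 1
            node.wins += result
            result = 1.0 - result
            node = node.parent
    best = max(root.children, key=lambda ch: ch.visits)
    return best.move, root


move, root = mcts(pile=3)
print(move, root.visits)
```

Each iteration adds at most one node, so the tree never holds more than `iterations + 1` positions, and visits concentrate on the moves that look best so far rather than being spread evenly over every continuation.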

In Deep Blue (the old IBM "chess computer" that beat Kasparov) the board evaluation function was hand-written using human chess expertise. In modern neural-net based engines such as AlphaGo and AlphaZero, the board evaluation function is learnt from human games and/or self-play, learning which positions lead to winning outcomes.
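As a rough illustration of what "hand-written" means here, the sketch below scores a position by material alone, using the conventional textbook piece values. This is an assumed toy example, far simpler than Deep Blue's actual evaluation function; `PIECE_VALUES` and `material_eval` are names invented for the illustration, and the input is assumed to be the piece-placement field of a FEN string.

```python
# Conventional textbook piece values (an assumption for this sketch).
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9, "k": 0}


def material_eval(fen_board: str) -> int:
    """Material balance from White's point of view.

    `fen_board` is the piece-placement field of a FEN string:
    uppercase letters are White pieces, lowercase are Black.
    """
    score = 0
    for ch in fen_board:
        value = PIECE_VALUES.get(ch.lower())
        if value is None:
            continue  # digits (runs of empty squares) and '/' rank separators
        score += value if ch.isupper() else -value
    return score


start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
print(material_eval(start))
```

A learnt evaluator would plug into the search through the same kind of interface (position in, score out), but its parameters would be fitted from game outcomes rather than written down by hand.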

So, not just brute force, but that (MCTS) is still the core of the algorithm.

2 comments

bubblyworld 471 days ago

This is a somewhat uninteresting matter of semantics, but I think brute force generally refers to exhaustive search. MCTS is not brute force for that very reason (the vast majority of branches are never searched at all).

HarHarVeryFunny 471 days ago

OK, but I think it's generally understood that exhaustive search is not feasible for games like Chess and Go, so when "brute force" is used in this context it means an emphasis on deep search and number of positions evaluated rather than the human approach where many orders of magnitude fewer positions are evaluated.

bubblyworld 471 days ago

I think that kind of erodes the meaning of the phrase. A typical MCTS run for AlphaZero would evaluate what, like 1024 rollouts? Maybe fewer? That's a drop in the ocean compared to the number of states available in chess. If you call that brute force then basically everything is.
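The scale gap can be checked with back-of-the-envelope arithmetic. The sketch below assumes a branching factor of about 35 legal moves per chess position (a commonly cited rough average, not a figure from this thread) and asks how deep a truly exhaustive search could go on a given evaluation budget; `exhaustive_leaves` and `max_full_depth` are hypothetical helper names.

```python
def exhaustive_leaves(branching: int, depth: int) -> int:
    """Leaf positions in a full game tree of the given depth (in plies)."""
    return branching ** depth


def max_full_depth(branching: int, budget: int) -> int:
    """Deepest ply count whose full tree still fits within `budget` leaves."""
    depth = 0
    while exhaustive_leaves(branching, depth + 1) <= budget:
        depth += 1
    return depth


BRANCHING = 35  # assumed rough average for chess

print(exhaustive_leaves(BRANCHING, 2))       # 1225 leaves at just 2 plies
print(max_full_depth(BRANCHING, 1024))       # 1
print(max_full_depth(BRANCHING, 1_000_000))  # 3
```

Under that assumption, 1024 evaluations would not even cover every 2-ply continuation exhaustively, and a million evaluations would stop at 3 plies, so any deeper lookahead on those budgets has to come from choosing which branches to follow.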

I've personally viewed well over a hundred thousand rollouts in my training as a chess bot =P

visarga 471 days ago

> Game programs like AlphaGo and AlphaZero (chess) are all brute force at core -

What do you call 2500 years of human game play if not brute force? Cultural evolution took 300K years, quite a lot of resources if you ask me.

HarHarVeryFunny 471 days ago

That 2500 years of game play is reflected in chess theory and book openings, what you might consider as pre-training vs test time compute.

A human grandmaster might calculate 20-ply ahead, but only for a very limited number of lines, unlike a computer engine that may evaluate millions of positions for each move.

Pattern matching vs search (brute force) is a trade off in games like Chess and Go, and humans and MCTS-based engines are at opposite ends of the spectrum.

beepbooptheory 471 days ago

Either you missed an /s or I am very interested to hear you unpack this a little bit. If you are serious, it just turns "brute force" into a kind of empty signifier anyway.

What do you call the attraction of bodies if not love? What is an insect if not a little human?
