by jasisz
3935 days ago
This algorithm (in its regular form, often used in games and examples) has one interesting "downside" that I explored some time ago: selection is performed using the UCB formula, so it basically tries to maximize the player's payout. In most games this is an impractical assumption, because we end up expanding branches that our opponent will most likely not choose. In the example (I assume the gray moves are "our" moves), we are much more likely to expand the 5/6 branch than the 2/4 branch, even though the 2/4 branch is in fact the one our opponent is more likely to choose.
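
To make the point concrete, here is a rough sketch of the selection step using the common UCB1 variant of the formula. The function names, the exploration constant sqrt(2), and the parent visit count of 10 (simply 6 + 4) are my assumptions for illustration, not taken from the example; only the 5/6 and 2/4 counts are. The `opponent_to_move` flag is also hypothetical and assumes a zero-sum game without draws, so the opponent's wins are just visits minus our wins.

```python
import math


def ucb1(wins, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score: average payout plus an exploration bonus."""
    return wins / visits + c * math.sqrt(math.log(parent_visits) / visits)


def select(children, parent_visits, opponent_to_move=False):
    """Return the (wins, visits) child with the highest UCB1 score.

    children: (wins, visits) pairs counted from "our" point of view.
    opponent_to_move: if True, score each child by the opponent's wins
    instead (assumption: zero-sum game, no draws).
    """
    def score(child):
        wins, visits = child
        if opponent_to_move:
            wins = visits - wins  # flip the payout to the opponent's side
        return ucb1(wins, visits, parent_visits)

    return max(children, key=score)


children = [(5, 6), (2, 4)]  # the two branches from the example
parent_visits = 10           # assumed: 6 + 4

print(select(children, parent_visits))                         # our payout
print(select(children, parent_visits, opponent_to_move=True))  # opponent's payout
```

Under these assumptions, maximizing our own payout picks the 5/6 branch, while scoring the same node from the opponent's side picks the 2/4 branch instead.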