Hacker News new | ask | show | jobs
by feral 4479 days ago
I agree that all approaches are approximations, and which would actually perform best in this game is an empirical question. It'd take some work to really explore and tune either minimax or MC approach, so I wouldn't throw either out due to one failed attempt.

I accept the argument that "minimax is conservative, and conservative is good" might be correct. But I don't think its likely, and, without the time to code my own solution, all I can do is give arguments to that end.

I gave one intuitive argument here: https://news.ycombinator.com/item?id=7381382

Another argument is to remember that minimax, and AB-pruning, is a really strong way of reducing your search space - because of how unfavourable adversary moves are propagated up the tree - which could result in drastic pruning if the minimax assumption is wrong.

One 'bad' state, 6 or 8 ply deep through your branching factor ~10 tree, can result in you pruning entire lines of enquiry using minimax AB; surely that can't be right if the chance of the bad state happening is tiny, and especially if an alternative is chosen which isn't much better.

So I still think that if you want to tune the search algorithm to be risk adverse, then, yes, do so; but minimax is a drastic way to achieve that. But yes, how big of an effect that decision has in practice depends on complicated things, such as correlations in the search tree. (i.e. If you find a bad state in a section of the game tree, maybe there are likely to be other bad states nearby, so its not such a big deal to prune that whole section using minimax).

2 comments

I was initially surprised that minimax worked at all for the exact reasons you cite. But it actually matches my intuition from playing the game kind of a lot. I think it's generally the case that most "opponent" moves are completely OK but there's one move that results in a high probability of loss (the one that's behind your "stack"). Avoiding worse case scenarios is really the entire game (i.e. your example wherein low probability of certain loss is being compared unfavorably to a guaranteed 99% loss rate does not actually come up). So if you built a really good, comprehensive MC approach, it might just reduce itself to minimax anyway. Thus perhaps the minimax is just a simpler way to implement the right approach from the get-go.
Actually I thought about it more. There are a few approaches

so you only need to decide 1 of 4 moves at the beginning

1. min max to a finite horizon using a heuristic utility function (as implemented)

2. Dynamic program/MCMC to a finite horizon and use the heuristic. Good at modelling the opponent behaviour, but could lead to bad results with a bad heuristic. (commented out approach)

3. Sample till the game ends (infinite horizon), pick the first move that lead to the game that went the longest (or won). This avoids developing an ad hoc heuristic.

So now I vote for 3. :p