|
The poker analogy seems like the right one to use, at least for studying the sorts of things an agent would need to do, although Pokemon is made messier by the level of variance (meaning both "semi-random effects" and "far more than 52 possibilities for mons and moves"). I'd imagine the completely-hidden playstyles would be incredibly hard for an AI to learn, but the popular Showdown style with team preview might be workable. There's definitely a recognizable 'tempo' to Pokemon: A picks a move that threatens B, B switches to something that can take it and threaten back, then A in turn switches to take the hit and threaten back. Playing that tempo well, much like just accurately betting your hand strength in poker, is enough to beat a lot of amateurs. The metaphor goes from there, though I might use 'raise' for leaving a threatened Pokemon exposed, which lets us differentiate a strong hand ("I'll use a coverage move with higher speed") from a bluff ("I can hit his switch if I call it"). As an example, take an opening of Koko vs. Landorus: the fold is switching Koko to Skarmory, the honest raise is HP Ice, and the bluff is Thunderbolt. The basic ebb and flow of the game seems to be that plus one more layer: double switches and attempts to predict them. Above that, there's just not enough probability mass left to benefit from trying to triple switch, counter-counter-switch, and so on. Of course, it's all made vastly more complicated by trying to trap, set hazards or status, and make space for setup moves. I'm not sure what it would take to get an unsupervised learner to value e.g. Rocks appropriately. My experience has been that neural nets struggle badly at assessing that sort of long-term state change, though of course I'm not working at OpenAI or DeepMind levels. |
I'd argue that the raise is U-Turn :-), which instantly wins any switching contest. For example: U-Turn on the switch leaves the option to go into Magnezone to trap the Skarmory, or if Lando stays in you can switch to your dedicated Lando counter (not that Lando really has a solid counter, mind you, but you get the idea).
The U-Turn war between Lando and Koko, however, demonstrates the bluffing game once again. Koko staying in and doing something weird like Calm Mind, or even Reflect/Light Screen, would be absurd, but it would definitely beat the Lando U-Turn in most cases.
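
For what it's worth, the raise-versus-bluff part of the Koko vs. Landorus opening can be sketched as a tiny zero-sum matrix game. This is only a toy: the payoff numbers below are made up for illustration (they are not damage calcs or usage stats), the function name is my own, and it ignores the fold, U-Turn, hazards, and everything else that makes the real game hard. Rows are Koko's choice (HP Ice or Thunderbolt), columns are Lando's (stay in or switch out), and the closed-form 2x2 solution gives how often each side should mix.

```python
from fractions import Fraction


def solve_2x2_zero_sum(a, b, c, d):
    """Mixed equilibrium of a 2x2 zero-sum game.

    Row player's payoffs are [[a, b], [c, d]]. Returns (p, q, value):
    p is the probability the row player picks row 1, q is the probability
    the column player picks column 1, and value is the row player's
    expected payoff. Raises if a pure-strategy saddle point exists.
    """
    a, b, c, d = (Fraction(x) for x in (a, b, c, d))

    # If maximin equals minimax, someone has a safe pure choice and
    # there is no prediction game to solve.
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        raise ValueError("saddle point exists; no mixing needed")

    denom = a - b - c + d
    p = (d - c) / denom            # row 1 frequency
    q = (d - b) / denom            # column 1 frequency
    value = (a * d - b * c) / denom
    return p, q, value


# HYPOTHETICAL payoffs from Koko's point of view (illustrative only):
#                     Lando stays   Lando switches
#   HP Ice (raise)        +3             -1
#   Thunderbolt (bluff)   -3             +2
p_hp_ice, q_stay, value = solve_2x2_zero_sum(3, -1, -3, 2)
print(f"Koko clicks HP Ice {p_hp_ice} of the time")
print(f"Lando stays in {q_stay} of the time")
print(f"Expected payoff for Koko: {value}")
```

With these invented numbers Koko clicks the honest move a bit more than half the time and Lando mostly switches, which at least matches the feel of the bluffing game. Change the payoffs and the frequencies move with them, so treat the output as a way to reason about the structure, not as advice.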