by shannonpig 3518 days ago
I think you're thinking about the difference between full-information and partial-information games, rather than stateless games. (The state of a StarCraft game is stored on a central server; the computer just has no access to it.)

There's already been excellent progress in poker AI based on traditional game theory; see http://poker.cs.ualberta.ca/.

It's not just the limited information, however. There's also the solution space, or "action space". Poker's action space is quite narrow compared to StarCraft's, where two actions can differ by as little as a single pixel. This is the main thing that makes StarCraft "much more realistic". In real life you have infinitely many moves and options with (almost) analog differences, whereas a game like poker is much more discrete. Go is somewhere in between.

For instance, as any Protoss player knows all too well, a Zealot placed a mere pixel away from where it should be could let a stream of Zerglings in, throwing any game plan you had up to that point out the window and forcing you into crisis-management mode. Usually this is a "mistake", which a computer may never make, but it needs to learn the relevance of such pixel-sized mistakes, which is unlikely to have as much of an impact in a poker game or Go game as it would in StarCraft.
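
To put rough numbers on the "discrete vs. almost analog" point, here is a back-of-the-envelope sketch. The 19x19 Go board and the fold/call/raise options of limit poker are standard; the 4096x4096-pixel map is purely an assumed size for illustration, and the function names are made up for this example.

```python
# Rough count of the distinct choices available for ONE decision in each game.
# Illustrative only: it ignores timing, unit selection, and everything else
# that makes the real StarCraft action space even larger.

GO_BOARD_SIZE = 19                               # standard Go board
LIMIT_POKER_ACTIONS = ("fold", "call", "raise")  # limit poker betting options

# ASSUMPTION: hypothetical map dimensions in pixels, chosen for illustration.
ASSUMED_MAP_WIDTH_PX = 4096
ASSUMED_MAP_HEIGHT_PX = 4096


def go_opening_moves(board_size: int = GO_BOARD_SIZE) -> int:
    """Number of points where the first stone can be placed."""
    return board_size * board_size


def pixel_placements(width_px: int = ASSUMED_MAP_WIDTH_PX,
                     height_px: int = ASSUMED_MAP_HEIGHT_PX) -> int:
    """Number of pixel positions where a single unit could be told to stand."""
    return width_px * height_px


if __name__ == "__main__":
    print("limit poker options per decision:", len(LIMIT_POKER_ACTIONS))
    print("Go opening moves:", go_opening_moves())
    print("pixel placements for one unit (assumed map):", pixel_placements())
```

Even under these simplifications, placing a single unit already has tens of thousands of times more options than the first move in Go, which is the gap the Zealot example is getting at.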

> but it needs to learn the relevance of such pixel-sized mistakes, which is unlikely to have as much of an impact in a poker game or Go game as it would in StarCraft.

I agree with your point, but you need to look no further than the game that AlphaGo lost against Lee Sedol.

It committed a very bizarre mistake that made no sense gamewise.

You're right, that's the technical term for it, thanks. I was referring to the agent's evaluation function being stateless (i.e. it does not _need_ to persist state from previous turns in order to play optimally).

I think there are two distinct aspects of the incomplete information that are significant. First, ignoring temporality, you have the fog of war, so you can't see the whole board. This is probably easier to address in an RNN, since you can play quite well as an amnesiac that just reacts to things that are currently visible. But you need to scout less if you have a memory of what's out there, so the amnesiac won't be able to play optimally.
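
To make the amnesiac-versus-memory distinction concrete, here is a toy sketch. It is not anything DeepMind has described: the 8x8 grid, the enemy positions, the scouting path, and the class names are all invented for illustration. Both scouts walk the same path under fog of war; one keeps only what is visible right now, the other accumulates what it has seen.

```python
from typing import List, Set, Tuple

Cell = Tuple[int, int]

# ASSUMPTIONS: all of these values are hypothetical, chosen to keep the toy small.
MAP_SIZE = 8
SIGHT_RADIUS = 1
ENEMY_POSITIONS: Set[Cell] = {(1, 1), (6, 6)}
SCOUT_PATH: List[Cell] = [(1, 1), (3, 3), (6, 6)]


def visible_cells(pos: Cell, radius: int, size: int) -> Set[Cell]:
    """Cells within `radius` of `pos` (square window), clipped to the map."""
    x0, y0 = pos
    return {
        (x, y)
        for x in range(max(0, x0 - radius), min(size, x0 + radius + 1))
        for y in range(max(0, y0 - radius), min(size, y0 + radius + 1))
    }


class AmnesiacScout:
    """Knows only the enemies visible at the current step."""

    def __init__(self) -> None:
        self.known: Set[Cell] = set()

    def observe(self, seen: Set[Cell]) -> None:
        self.known = set(seen)  # overwrite: earlier sightings are forgotten


class RememberingScout:
    """Accumulates every enemy position it has ever seen."""

    def __init__(self) -> None:
        self.known: Set[Cell] = set()

    def observe(self, seen: Set[Cell]) -> None:
        self.known |= seen  # union: earlier sightings are kept


def run_scout(scout) -> Set[Cell]:
    """Walk the fixed path and return what the scout knows at the end."""
    for pos in SCOUT_PATH:
        window = visible_cells(pos, SIGHT_RADIUS, MAP_SIZE)
        scout.observe(window & ENEMY_POSITIONS)
    return scout.known


if __name__ == "__main__":
    print("amnesiac knows:   ", sorted(run_scout(AmnesiacScout())))
    print("remembering knows:", sorted(run_scout(RememberingScout())))
```

At the end of the path the amnesiac knows only about the enemy it is currently standing next to, so it would have to scout the first location again, while the remembering scout still knows about both.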

Then there's the temporal aspect. The set of previous states of the game is not stored in the game and made available to the player, and so to play optimally you have to have a memory. This is where new techniques will be necessary.

These are separate problems, I think, so it will be interesting to see if DeepMind can make progress without reaching human performance on the second part.