Hacker News
by minimaxir 2869 days ago
Per the architecture, the model does use a CNN to process minimap data.

Yes, but the comment you're replying to is saying that it would be interesting to rely only on information from visual input on the screen, rather than on getting (for example) the absolute XY position of every player as a direct input to the network.
It's like writing a bot that plays chess using a video feed from a camera. Yes, it's doable and an interesting problem on its own, but it's completely unrelated to what OpenAI is doing.
Chess is a poor example because it's a turn-based game, whereas Dota is real-time and incredibly fast-paced. Visually parsing a chessboard to know which piece is which is trivial, and you also have all the time in the world to do it (between turns). In Dota things happen so quickly, particle effects pop off all over the place, and through all of it, you have to constantly reposition your camera by hand to keep it in the optimal position.

In chess, both players always have perfect information about the game-state, and this is far from the case in Dota. OpenAI does account for fog of war, so it's not COMPLETELY omniscient, but it is still more omniscient than human players ever have the ability to be, without having to fiddle with the camera etc.

Yes, I see what you are saying with the chess example, but I think in this case adding the visual layer actually adds interesting problems related to what OpenAI is trying to do. See my reply to lawrenceyan above.