by narsil 3737 days ago
Agreed. The DeepMind team have already done this with other Atari games: https://youtu.be/08Cl7ii6viY?t=11m35s (Oct 2015)

"The system only gets raw pixels as input."

The results were pretty great, so it would be fascinating to see this work with Matt's version of SC2 as mentioned elsewhere in this thread: https://news.ycombinator.com/item?id=11326119


Raw pixels, and the score. The score was supplied separately, or rather as a reward signal representing the change in score. Relevant quote from the paper:

"The emulator’s internal state is not observed by the agent; instead it observes an image x_t ∈ R^d from the emulator, which is a vector of raw pixel values representing the current screen. In addition it receives a reward r_t representing the change in game score."

The paper: http://arxiv.org/abs/1312.5602
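
To make the quoted setup concrete, here is a rough sketch of that interface: the agent sees only a pixel vector and a reward equal to the change in score, never the emulator's internals. Everything in it is hypothetical and made up for illustration (the `ToyEmulator` class, its random "screen", and the rule that action 1 scores a point); it is not the paper's code or any real emulator API.

```python
import random


class ToyEmulator:
    """Hypothetical stand-in for a game emulator (not a real Atari API)."""

    def __init__(self, num_pixels=16, seed=0):
        self._rng = random.Random(seed)
        self._num_pixels = num_pixels
        self._score = 0  # internal state: never exposed to the agent directly

    def _render(self):
        # x_t: a flat vector of raw pixel values for the current screen.
        return [self._rng.randrange(256) for _ in range(self._num_pixels)]

    def step(self, action):
        previous_score = self._score
        # Made-up game rule: action 1 scores a point, anything else does not.
        if action == 1:
            self._score += 1
        # r_t: the change in game score, not the score itself.
        reward = self._score - previous_score
        return self._render(), reward


def run_episode(emulator, actions):
    """Feed a fixed list of actions and collect what the agent would see."""
    observations, rewards = [], []
    for action in actions:
        x_t, r_t = emulator.step(action)
        observations.append(x_t)
        rewards.append(r_t)
    return observations, rewards


observations, rewards = run_episode(ToyEmulator(), [1, 0, 1, 1])
```

The point of the sketch is that `_score` stays private: the agent can only reconstruct how well it is doing by summing the per-step rewards it is handed.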

It would be great if it could also learn to tell if it was doing fine.
Agreed. I think that is one of the major drawbacks, limitations, or unaddressed aspects of deep learning algorithms: they are primarily supervised learning, in the sense that you have to explicitly identify good and bad examples. Determining what is good and bad in the first place (figuring out that the number at the upper right of the screen is a score) would be a major breakthrough with implications far beyond game playing. DNNs have been a breakthrough, giving complex neural networks much better accuracy and discrimination capabilities, but they still require the researcher to point out what is good and bad. We still need a just-as-significant breakthrough in unsupervised learning.
I imagine this would be massively more difficult to do with SC2, due to the much higher variation in pixels.