by sinwave
4214 days ago
While this complaint generally has validity, their paper [1] does IMO present an advance; it's not just handing a bunch of labeled data off to a large neural network. IIRC (forgive me, I read the paper a few weeks ago) the solution is at its core a reinforcement learning system, with the deep net only making up the component that predicts expected reward from a (state, action) pair. With that in hand, there remains the non-trivial RL problem of balancing "exploration vs. exploitation" in learning good strategies to play the game(s).

While NNs have been used in this capacity before, I believe that, as other comments have mentioned, using a deep net to learn to map a high-dimensional state-action space (e.g., the state of the game represented as the pixels of the screen at a particular time) to expected reward in real time was indeed an advance, both theoretical and technical.

And, oh yeah, I just remembered that a University of Texas research group is doing work in this area too (there was a recent paper [2] from Peter Stone and others).

(Edited for clarity) (Edited again to suggest another paper)

[1] - http://arxiv.org/pdf/1312.5602.pdf
[2] - http://www.cs.utexas.edu/~pstone/Papers/bib2html-links/TCIAI...
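
To make the exploration vs. exploitation point concrete, here is a minimal sketch of epsilon-greedy Q-learning. It is not the paper's method: a plain lookup table stands in for the deep net, and the five-state "chain" game, the function names, and all hyperparameters are assumptions made up for illustration.

```python
import random


def epsilon_greedy(q_values, epsilon, rng):
    """Explore with probability epsilon, otherwise exploit the best-known action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    best = max(q_values)
    # Break ties at random so an untrained agent does not get stuck on one action.
    return rng.choice([a for a, v in enumerate(q_values) if v == best])


def train_chain(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
                epsilon=0.1, seed=0):
    """Q-learning on a hypothetical toy chain.

    Action 1 moves right, action 0 moves left (clamped at state 0).
    Reaching the rightmost state pays reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    # q[s][a] estimates the expected discounted reward of action a in state s.
    # In the paper's setting a deep net would play this role instead of a table.
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # cap episode length
            a = epsilon_greedy(q[s], epsilon, rng)
            s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            done = s2 == n_states - 1
            r = 1.0 if done else 0.0
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = r if done else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s = s2
            if done:
                break
    return q


if __name__ == "__main__":
    for state, values in enumerate(train_chain()):
        print(state, [round(v, 3) for v in values])
```

After training, the "move right" estimate should exceed the "move left" estimate in every non-terminal state. Setting epsilon to 0 removes the forced exploration and leaves only the random tie-breaking to discover the reward, which is the trade-off mentioned above in miniature.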