by gwern
3405 days ago
I'm not nearly that pessimistic. Beating SSBM is well within the capability of a well-tuned A3C, and definitely within the capabilities of a group like DeepMind. More neuromorphic hardware is unnecessary: current RL methods are more CPU-bound than GPU-bound (take a look at the NN they use, it's trivially small; most of the computation goes towards running many SSB games in parallel just to generate data for a few small updates to the NN). I believe they've actually handicapped themselves with their shortcuts: the agents' performance is crippled by their inability to see projectiles, a result of the choice to avoid learning from pixels (which I bet would actually be quite fast, as learning from pixels is not the bottleneck in ALE). Likewise, the use of the other RAM features is the path of the Dark Side: it allows immediate quick learning through huge dimensionality reduction, seductively simple, yes, yet poison in the end, as the agent is unable to learn all the other things it would otherwise have learned (such as projectiles). I suspect this is why their current implementation is unable to learn to play multiple characters: it can't see which character it is and what play style it should use. So I would not be surprised at all to hear in a year or two that a human-delay-equivalent agent using raw pixels could beat human champs routinely.
|
In fact, RAM features are likely to be much more useful for model-based approaches, which may be important for solving the action-delay problems.
As for multiple characters, the character ID is available to the network. I doubt pixels would be of any help there either.
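
To make "available to the network" concrete, here is a rough sketch of one common way to do it: append a one-hot character ID to the RAM-derived feature vector before it goes into the network. The roster, the feature values, and the names `CHARACTERS`, `one_hot`, and `build_observation` are all made up for illustration; I'm not claiming this is how the actual agent encodes it.

```python
# Hypothetical roster and feature layout, for illustration only;
# not taken from the actual agent's code.
CHARACTERS = ["fox", "falco", "marth", "sheik"]


def one_hot(index, size):
    """Return a list of `size` zeros with a 1.0 at position `index`."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def build_observation(ram_features, character):
    """Append a one-hot character ID to the RAM-derived features."""
    char_index = CHARACTERS.index(character)
    return list(ram_features) + one_hot(char_index, len(CHARACTERS))


# Three made-up RAM features (e.g. scaled position and damage values).
obs = build_observation([0.25, -0.5, 1.0], "marth")
print(obs)
```

With an input like this the network can condition its policy on the character without ever seeing a pixel, which is the point above.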