Hacker News
by pjrule 2935 days ago
As someone working on a reinforcement learning/neuroevolution problem right now, I find this extremely exciting. Fewer parameters, ceteris paribus, are always better: the fact that the experiments in this paper were run on one workstation, rather than on a massive farm of TPUs à la AlphaGo, implies quicker development iteration and more accessibility to the average researcher.

The staging of components in this paper (compressor/controller), where neuroevolution is only applied to a low-dimensional controller, reminds me of Ha and Schmidhuber's recent paper on world models (which is briefly cited) [1]. They employ a variational autoencoder with ~4.4M parameters, an RNN with ~1.7M parameters, and a final controller with just 1,088 parameters! Though it's recently been shown that neuroevolution can scale to millions of parameters [2], the technique of applying evolution to as few parameters as possible and supplementing with either autoencoders or vector quantization seems to be gaining traction. I hope to apply some of the ideas in this paper to multiple co-evolving agents...

[1]. https://worldmodels.github.io

[2]. https://arxiv.org/abs/1712.06567
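
To make the compressor/controller split concrete, here is a minimal Python sketch under loud assumptions: `compress` is a hypothetical stand-in (plain chunk averaging) for a learned compressor such as an autoencoder or vector quantizer, the task is a toy regression rather than a game, and the search is a basic (1+lambda) evolution strategy. None of this is taken from either paper; it only illustrates that evolution touches the handful of controller weights and nothing else.

```python
import random

def compress(observation, code_size):
    """Hypothetical stand-in compressor: average equal-sized chunks."""
    chunk = len(observation) // code_size
    return [sum(observation[i * chunk:(i + 1) * chunk]) / chunk
            for i in range(code_size)]

def act(weights, code):
    """Linear controller: one scalar action from the compact code."""
    return sum(w * c for w, c in zip(weights, code))

def fitness(weights, episodes, code_size):
    """Toy fitness: negative squared error against a target action."""
    return -sum((act(weights, compress(obs, code_size)) - target) ** 2
                for obs, target in episodes)

def evolve(episodes, code_size, generations=200, offspring=8,
           sigma=0.1, seed=0):
    """(1+lambda) evolution strategy over the controller weights only."""
    rng = random.Random(seed)
    parent = [0.0] * code_size
    parent_fit = fitness(parent, episodes, code_size)
    for _ in range(generations):
        # Mutate the parent with Gaussian noise to get lambda children.
        children = [[w + rng.gauss(0.0, sigma) for w in parent]
                    for _ in range(offspring)]
        scored = [(fitness(c, episodes, code_size), c) for c in children]
        best_fit, best = max(scored, key=lambda pair: pair[0])
        # Elitism: keep the parent unless a child is strictly better.
        if best_fit > parent_fit:
            parent, parent_fit = best, best_fit
    return parent, parent_fit

# Toy data: 64-dimensional observations, compressed to a 4-number code.
data_rng = random.Random(1)
episodes = []
for _ in range(20):
    obs = [data_rng.random() for _ in range(64)]
    target = sum(obs[:16]) / 16 - sum(obs[48:]) / 16
    episodes.append((obs, target))

baseline = fitness([0.0] * 4, episodes, 4)
weights, score = evolve(episodes, code_size=4)
```

Here the evolved genome has 4 numbers even though each observation has 64, which is the same division of labor described above at a much smaller scale.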

2 comments

You may be interested in an even older paper: http://www.idsia.ch/~juergen/icdl2011cuccu.pdf
Thanks so much! I read this (and a few related papers) today. Besides the novel algorithm discussed in the new Atari paper, do you have a reference implementation of online vector quantization you might be able to recommend? I think I could probably figure it out from the paper alone, but sometimes it's nice to see code other people have already optimized. :)
Unfortunately I do not. I could search for one on Google, but I doubt I would fare better than you at it. I ended up coding my own version; it is quite straightforward. You can find it here: https://github.com/giuse/machine_learning_workbench/blob/mas... Although it is polluted by research trial and error, you can easily pick out the minimal code necessary to run it. Here's an example of how to use it: https://github.com/giuse/machine_learning_workbench/blob/mas... Let me know if that works for you or if you have further questions!
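
For readers who want the gist without following the links, below is a minimal sketch of a generic online vector quantizer in Python. It is not the code from the repository linked above, and it is not the algorithm from the Atari paper; it is a plain online k-means style update, and the class name `OnlineVQ` and its parameters are made up for illustration.

```python
class OnlineVQ:
    """Generic online vector quantizer (illustrative, not the paper's)."""

    def __init__(self, n_centroids, learning_rate=0.1):
        self.n_centroids = n_centroids
        self.learning_rate = learning_rate
        self.centroids = []  # the codebook, filled as inputs arrive

    def encode(self, vector):
        """Return the index of the nearest centroid (squared Euclidean)."""
        if not self.centroids:
            raise ValueError("codebook is empty; train on some input first")
        distances = [sum((c - v) ** 2 for c, v in zip(centroid, vector))
                     for centroid in self.centroids]
        return min(range(len(distances)), key=distances.__getitem__)

    def train_one(self, vector):
        """Update the codebook with one input and return its code."""
        # While the codebook has room, store the input itself as a centroid.
        if len(self.centroids) < self.n_centroids:
            self.centroids.append(list(vector))
            return len(self.centroids) - 1
        # Otherwise nudge the nearest centroid toward the input.
        index = self.encode(vector)
        centroid = self.centroids[index]
        for i, value in enumerate(vector):
            centroid[i] += self.learning_rate * (value - centroid[i])
        return index

vq = OnlineVQ(n_centroids=2, learning_rate=0.5)
codes = [vq.train_one(v) for v in ([0.0, 0.0], [10.0, 10.0], [2.0, 2.0])]
```

Each call to `train_one` processes a single observation, so the codebook can be trained while the agent is playing instead of from a dataset collected in advance.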
That’s excellent! Thanks!
>I hope to apply some of the ideas in this paper to multiple co-evolving agents...

Care to elaborate?