| HN Mirror

	by Cybiote 2986 days ago
	Yeah, it's the same general principle of using a model to cheaply speed up policy learning. An advantage of their approach, however, is that it learns a latent space and generalizes better. The VAE learns a compressed vector whose latent variables are somewhat meaningful. The VAE can also be sampled from, so it is not just a table of memorized examples. The RNN maintains coherence with the actions and observations of previous time-steps, and a separate controller is also learned. The end result is that their approach is richer and more flexible.
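
	To make the data flow concrete, here is a rough sketch of how the three pieces described above could fit together: the VAE compresses each observation to a latent vector z, the controller picks an action from z and the RNN's hidden state h, and the RNN then folds z and the action into h. Everything here is an assumption for illustration only: the class names (ToyVAE, ToyRNN, ToyController), the dimensions, and the random untrained linear weights are hypothetical, and this is not the authors' implementation.

```python
import math
import random

# Hypothetical sizes, chosen only for illustration.
Z_DIM, H_DIM, OBS_DIM, ACT_DIM = 4, 8, 16, 2


def rand_matrix(rows, cols, rng):
    """Small random weight matrix (untrained stand-in for learned weights)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    """Plain matrix-vector product on Python lists."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


class ToyVAE:
    """Stand-in for the VAE: linear encoder and decoder."""

    def __init__(self, rng):
        self.rng = rng
        self.w_mu = rand_matrix(Z_DIM, OBS_DIM, rng)
        self.w_logvar = rand_matrix(Z_DIM, OBS_DIM, rng)
        self.w_dec = rand_matrix(OBS_DIM, Z_DIM, rng)

    def encode(self, obs):
        mu = matvec(self.w_mu, obs)
        logvar = matvec(self.w_logvar, obs)
        # Reparameterization: z = mu + sigma * eps, with eps ~ N(0, 1).
        return [m + math.exp(0.5 * lv) * self.rng.gauss(0.0, 1.0)
                for m, lv in zip(mu, logvar)]

    def decode(self, z):
        return matvec(self.w_dec, z)

    def sample(self):
        # Draw z from the unit-Gaussian prior and decode it: this is what
        # separates a generative model from a table of stored examples.
        z = [self.rng.gauss(0.0, 1.0) for _ in range(Z_DIM)]
        return self.decode(z)


class ToyRNN:
    """Stand-in for the RNN: one tanh recurrence over (z, action, h)."""

    def __init__(self, rng):
        self.w = rand_matrix(H_DIM, Z_DIM + ACT_DIM + H_DIM, rng)

    def step(self, z, action, h):
        return [math.tanh(x) for x in matvec(self.w, z + action + h)]


class ToyController:
    """Stand-in for the separate controller: maps (z, h) to an action."""

    def __init__(self, rng):
        self.w = rand_matrix(ACT_DIM, Z_DIM + H_DIM, rng)

    def act(self, z, h):
        return [math.tanh(x) for x in matvec(self.w, z + h)]


def rollout(observations, seed=0):
    """Run the encode -> act -> update-memory loop over a list of observations."""
    rng = random.Random(seed)
    vae, rnn, ctrl = ToyVAE(rng), ToyRNN(rng), ToyController(rng)
    h = [0.0] * H_DIM
    actions = []
    for obs in observations:
        z = vae.encode(obs)       # compressed latent vector
        a = ctrl.act(z, h)        # action from latent + memory
        h = rnn.step(z, a, h)     # carry context to the next time-step
        actions.append(a)
    return actions, h, vae.sample()
```

	Calling rollout() with a list of 16-element observation vectors returns one action per step, the final hidden state, and one observation decoded from a prior sample. In a real system each component would be trained; here the weights are random, so only the wiring is meaningful.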