	by ilyasut 2948 days ago
	There has been a fair bit of past work exploring the idea you described (examples: https://arxiv.org/pdf/1606.01868.pdf, https://arxiv.org/pdf/1703.01310.pdf, https://pathak22.github.io/noreward-rl/resources/icml17.pdf, https://openreview.net/forum?id=H1RPJf5Tz). Such methods can't yet solve games like Montezuma's Revenge to a comparable level of performance, but I'm sure they'll eventually get there.