Hacker News
by ilyasut, 2901 days ago
There has been a fair bit of past work exploring the idea you described (examples: https://arxiv.org/pdf/1606.01868.pdf, https://arxiv.org/pdf/1703.01310.pdf, https://pathak22.github.io/noreward-rl/resources/icml17.pdf, https://openreview.net/forum?id=H1RPJf5Tz). Such methods can't yet solve games like Montezuma's Revenge to a comparable level of performance, but I'm sure they'll eventually get there.