Hacker News
by levocardia 364 days ago
It's clearly written for an audience of other RL researchers, given that the conclusion is "will someone please come up with Q-learning methods that scale!"