by backpropaganda 2906 days ago
It also assumes access to the simulator, which is an even more problematic assumption. That's like saying you're doing image classification but assuming access to the 3D model which generated the image.

I think that analogy is a bit bogus, but if you want to make it, it's more like assuming access to a function that renders the 3D model from a variety of perspectives on command, rather than having access to the model itself.

(That's because the RL algorithm doesn't have access to the rules by which the simulation is carried out; it only has access to the commands and their results.)

And frankly, that would be a perfectly fair and interesting classification problem, so I don't see your point.

Otherwise, how exactly do you propose learning to drive a simulation without access to the simulation? I really don't know what you're saying here.
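
To make the "commands and result" point concrete, here is a minimal sketch of that kind of black-box access. Everything in it (the `BlackBoxSimulator` class, the `run_episode` helper, the toy walk-to-a-goal rule) is made up for illustration and is not from the blog post under discussion. The agent only calls `reset()` and `step()`; the rule inside `step()` stays hidden from it.

```python
class BlackBoxSimulator:
    """Hypothetical toy simulator. The agent never reads the underscore fields."""

    def __init__(self):
        self._position = 0  # hidden state
        self._goal = 3      # hidden rule: the episode ends at position 3

    def reset(self):
        """Start a new episode and return the first observation."""
        self._position = 0
        return self._position

    def step(self, action):
        """Apply a command (-1 or +1) and return (observation, reward, done)."""
        if action not in (-1, 1):
            raise ValueError("action must be -1 or +1")
        self._position += action
        done = self._position == self._goal
        reward = 1.0 if done else 0.0
        return self._position, reward, done


def run_episode(sim, policy, max_steps=20):
    """Drive the simulator using only commands and their results."""
    obs = sim.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        obs, reward, done = sim.step(policy(obs))
        total_reward += reward
        if done:
            break
    return total_reward


always_right = lambda obs: 1
always_left = lambda obs: -1
```

Here `run_episode(BlackBoxSimulator(), always_right)` reaches the goal and collects the reward, while `always_left` never does. Neither policy looks at how `step()` computes its result.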

My point is that the two problems are quite distinct. This is not a small change to how the problem is being solved, but a complete change of the problem itself. Further, the change significantly limits the feasibility of the solution, which the authors of the blog post do not make sufficiently clear. Casual followers of AI/RL research might think that this is significant progress, while in fact it's progress on a problem that hasn't really received any attention due to its uselessness. I think there may be 1-2 papers with experiments on this problem, versus probably hundreds on the model-free problem.

Thanks for your analogy though. I agree that it's better than mine. I was only trying to give a rough idea, but I'll use your analogy if I have to now. :)
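
For readers following along, one common reading of "access to the simulator" (the thread does not spell out which one is meant) is that the learner can copy the simulator's state and try actions on the copy before committing to one. A model-free learner interacting with a real environment cannot do that. The sketch below illustrates that reading only; `ToySim` and `lookahead_action` are hypothetical names, and the toy reward rule is an assumption.

```python
import copy


class ToySim:
    """Hypothetical toy simulator: reward 1.0 when position reaches 3."""

    def __init__(self, position=0):
        self.position = position

    def step(self, action):
        """Apply an action (-1 or +1) and return (observation, reward)."""
        self.position += action
        reward = 1.0 if self.position == 3 else 0.0
        return self.position, reward


def lookahead_action(sim, actions=(-1, 1)):
    """Pick the action with the best immediate reward by trying each on a copy.

    This needs simulator access: the real simulator is never advanced here,
    only deep copies of it are.
    """
    best_action = actions[0]
    best_reward = float("-inf")
    for action in actions:
        trial = copy.deepcopy(sim)      # snapshot of the current state
        _, reward = trial.step(action)  # try the action on the snapshot
        if reward > best_reward:        # ties keep the earlier action
            best_action, best_reward = action, reward
    return best_action


sim = ToySim(position=2)
chosen = lookahead_action(sim)
```

After this runs, `chosen` is the action that steps onto the goal, and `sim.position` is still 2 because only copies were advanced. Without the ability to copy or reset state, the learner would have to spend a real step to find out what each action does.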