by optimalsolver
1493 days ago
>and given demonstrations of a successful agent wouldn't be hard

Last I checked, the only team that has shown good performance on that game is Uber, and from what I recall they used a controversial hack that would be unlikely to generalize to other environments.

However, once the exploring was done, they could train an agent on the exploring agent's trajectories to solve Montezuma's Revenge (MR) with no problem. That's why I say that MR is an exploration problem, and training on demonstrations from a player that has already solved MR would obviously work, because it does. So it doesn't show anything interesting about Gato, because Gato would be solving the part of MR that everyone agrees is basically trivially easy.