| > Pure vision will never be enough because it does not contain information about the physical feedback like pressure and touch, or the strength required to perform a task. I'm not sure that's necessarily true for a lot of tasks. A good way to measure this in your head is this: "If you were given remote control of two robot arms, and just one camera to look through, how many different tasks do you think you could complete successfully?" When you start thinking about it, you realize there are a lot of things you could do with just the arms and one camera, because you as a human have really good intuition about the world. It therefore follows that robots should be able to learn with just RGB images too! Counterexamples would be things like grabbing an egg without crushing it, perhaps. Though I suspect that could also be done with just vision. |
I don't see how that follows. Humans have trained by experimenting with actually manipulating things, not just by vision. It's not clear at all that someone who had gained intuition about the world exclusively by looking at it would have any success with mechanical arms.