by skywhopper 3251 days ago
The headline seems to misuse the word "semantic" (not to mention "understand"). Does the door-opening robot now understand how to open all hinged doors with a similar opening mechanism? Or was it just trained to imitate a sequence of changes in a 2D image from a fixed angle? Can the same software and robot also be taught to open windows? Boxes? We are talking about "semantics" explicitly here. Does it understand "open" versus "closed" for these different types of closures/portals?

I don't want to discount the value of this research. It's absolutely necessary to do this sort of basic proof-of-concept testing of these ideas. But the claim being made implicitly here is way beyond what's actually going on. The software understands nothing, and the "semantics" extend to simple image-matching of objects, but there's no deeper meaning associated with the labels, so I think calling that "semantics" is a major stretch.

This approach is not going to teach a robot how to pick fruit, or serve food, or clean floors anytime soon. In the best case where this is even a workable approach, research like this is just the first of millions more tiny steps along the path. Anyway I think it's naive to assume that a good way to approach automation is to write software to let robots learn by watching humans do the desired task. As cool as that sounds, chances are that approach would ultimately be a massively inefficient way to solve the problem. It'd be like trying to invent the automobile by building a steam-powered horse robot that can tow carriages. The critical purpose is being overlooked in favor of a cool-looking but totally impractical toy demo.

3 comments

That seems to be what they are aiming towards, but true AI is not the same as machine learning.

Google is still using what we could call a very rudimentary form of AI. As they describe it: "Unsupervised learning on very small datasets is one of the most challenging scenarios in machine learning. To make this feasible, we use deep visual features from a large network trained for image recognition on ImageNet".

Interesting points. What would you consider an adequate task that demonstrates a program/robot has acquired "semantic understanding"?
Understanding instructions in natural language.

For example:

1. 'grab that red ball'

2. 'turn the handle on the door 90 degrees, then pull it out'

Train it the way you would teach a person: give it a video, a written instruction, and a label indicating whether the task succeeded. Then show it a different setting and a new instruction. If the model has generalized and understands the semantics behind the instruction, it should carry out the new one successfully.
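To make that test concrete, here is a rough sketch in Python of how it could be scored. Everything in it is hypothetical and only for illustration: `Trial`, `held_out_success_rate`, and the `attempt` callback (which stands in for running the robot and checking the success label) are names I made up, not anything from the research being discussed. The only point is that trials count toward the score only when both the setting and the instruction were never seen in training.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Trial:
    """One labeled attempt; both fields are hypothetical illustration data."""
    setting: str       # where the task happens, e.g. "lab door"
    instruction: str   # natural-language command given to the robot


def held_out_success_rate(
    attempt: Callable[[Trial], bool],
    train: Iterable[Trial],
    test: Iterable[Trial],
) -> float:
    """Fraction of test trials with an unseen setting AND an unseen
    instruction that the robot completes successfully."""
    train = list(train)
    seen_settings = {t.setting for t in train}
    seen_instructions = {t.instruction for t in train}
    # Keep only trials that are new in both respects.
    unseen = [
        t for t in test
        if t.setting not in seen_settings
        and t.instruction not in seen_instructions
    ]
    if not unseen:
        raise ValueError("no held-out trials to evaluate")
    # attempt() stands in for running the robot and reading the success label.
    return sum(1 for t in unseen if attempt(t)) / len(unseen)
```

Under this scoring, a system that only replays what it saw in training gets no credit, since every trial it could have memorized is filtered out before counting.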

I tied a rope to my door handle and put a treat on the other side of the door. I tried for half an hour to get my dog to learn how to open the door. I demonstrated it many times. I guided his motions through the procedure. I put a treat on the rope to encourage him to interact with it. He moved the door while chewing on the treat, but still didn't learn how to open the door on purpose. I gave up.

So yeah, maybe it's not anywhere near human intelligence. But it's still cool they've made a robot smarter than my dog.

I think this says more about your lack of knowledge. See https://www.youtube.com/watch?v=QKSvu3mj-14 for more information.

EDIT: To provide a bit more information: I linked a video because I think it is much more impactful to see what happens than to read about it. In this video you can see how an appropriate history of reinforcement will lead to very complex behavior in simple animals. By complex I mean behavior like "talking" and "problem solving".

Here is the 2nd part: https://www.youtube.com/watch?v=erhmslcHvaw