by sashank_1509 1005 days ago
Flipping a pancake was done in 2010. What looks impressive for humans is easy for robots and vice versa: https://youtu.be/W_gxLKSsSIE?si=HDyNXe1Ys_eFXiVU Another case in point: robot juggling was done in the 1990s, and to date we do not have a robot that can reliably open any door the way a human can. Kind of like Moravec's paradox.

To be fair it is far more complex for a robot to grip a spatula and use that spatula on a griddle than to use dynamic motion to flip a pancake in a pan.
Ehhh.

Solving any one problem with robotic manipulation isn’t all that hard. It takes a lot of trial and error, but in general if the task is constrained you can solve it reliably. The trick is to solve *new* tasks without resorting to all that fine tuning every time. Which is what Russ is claiming here. He’s training an LLM with a corpus of one-off policies for solving specific manipulation tasks, and claiming to get robust ad hoc policies from it for previously unsolved tasks.

If this actually works, it’s pretty important. But that’s the core claim: that he can solve ad hoc tasks without training or hand tuning.

  > He’s training an LLM with a corpus of one-off policies for solving specific manipulation tasks, and claiming to get robust ad hoc policies from it for previously unsolved tasks.
It seems clear that many people do not understand that this is the key breakthrough: solving arbitrary tasks after learning previous, unrelated tasks.

In my opinion that really is a good definition of intelligence, and puts this technique at the forefront of machine intelligence.

Is the pancake and spatula problem actually that constrained though?

I know it isn’t as open ended as plenty of more important problems in robotics, but this doesn’t strike me as easy at all.

I’ve only dabbled in robotics as an entry-level hobbyist, so I really don’t know the answer.

It’s constrained enough to be tractable.
Fair enough. When would you say it stops being tractable? What single, practical thing could we add to this problem to make it intractable?
Flipping a pancake in a "random kitchen" would be much more difficult and have many of the same issues as the door problem.

It's hard to point to a single thing that would make "flipping pancakes" intractable. It's sort of the other way around: usefully flipping pancakes the way a person does takes a lot of skills chained together.

The "door problem" is a sort of compendium of many real-world skills: identifying the door, understanding its affordances and how to grip / manipulate them, deciding whether to push or pull the door, predicting the trajectory of the door when opened, estimating the mass of the door and applying the right amount of force, understanding if there are any springs or pulls on the door and how it must be held to traverse through it, etc. There are also a ton of things I'm missing that are so fundamental one tends to take them for granted, like knowing your own size and that you can't fit through a tiny doorway.

I think you can ramp towards the "door problem" in difficulty by slowly relaxing constraints. A video linked above (not the article) shows "can flip a pancake successfully with a particular pan (you are already holding) and pancake with a fixed camera and visual markers". Ok, now do it in varying lighting conditions. With no visual markers. With different camera views. Different pancakes. Real pancakes (which are not rigid, and sometimes stick to the pan). Different pans. Now you have to pick up the pan. Use a stove. Different stoves. Identify griddle vs pan and use the right flipping technique. Find everything and do it all in a messy kitchen... eventually you're getting to the same ballpark as the "door problem".