by jvanderbot 826 days ago

I am fond of saying there are only two hard problems in robotics: Perception and Funding. If you have a magical sensor that answers questions about the world, and have a magic box full of near-limitless money, you can easily build any robotic system you want. If perception is "processing data from sensors and users so we can make decisions about it", then there isn't much robotics left.

Got a controls problem? forward predict using the magic sensor.

Got a planning problem? just sense the world as a few matrices and plug it into an ILP or MDP.

What did the user mean? Ask the box.

etc etc. Distilling the world into the kind of input our computers require is immensely difficult, but once that's done "my" problem (being a planning expert) is super easy. I'm often left holding the bag when things go wrong because "my" part is built last (the planning stack), and has the most visible "breaks" (the plan is bad). But 90% of the time it's traceable up to the perception, or a violated assumption about the world.

TFA is spot on - it's just not clear how to sense the world to make "programming" robotics a thing. In the way you'd "program" your computer to make lines appear on a screen or packets fly across the internet, we'd love to "program" a robot to pick up an object and put it away, but even a specious attempt to define generally what "object" and "put away" mean is still 100s of PhD theses away. So it's like we invent the entire ecosystem from scratch each time we build a new robot.
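To make the "plug it into an ILP or MDP" step concrete, here is a minimal sketch of value iteration on a toy MDP. Everything in it is an assumption for illustration: the 5-cell corridor, the reward of 1 for reaching the goal, the discount factor, and all function names. It presumes the hard part (perception) has already handed over a clean state.

```python
# Hypothetical toy world: a 1-D corridor of 5 cells, goal at cell 4.
# Assumes perception already produced a clean, discrete state.
N_STATES = 5
GOAL = 4
ACTIONS = (-1, +1)  # step left / step right
GAMMA = 0.9         # discount factor (assumed)


def step(state, action):
    """Deterministic transition: move one cell, clamped to the corridor."""
    return min(max(state + action, 0), N_STATES - 1)


def reward(state, action):
    """Reward 1.0 for entering the goal cell, 0.0 otherwise."""
    return 1.0 if step(state, action) == GOAL else 0.0


def value_iteration(tol=1e-9):
    """Repeat Bellman backups until the values stop changing."""
    values = [0.0] * N_STATES
    while True:
        new_values = []
        for s in range(N_STATES):
            if s == GOAL:
                new_values.append(0.0)  # terminal state
                continue
            new_values.append(
                max(reward(s, a) + GAMMA * values[step(s, a)] for a in ACTIONS)
            )
        if max(abs(a - b) for a, b in zip(values, new_values)) < tol:
            return new_values
        values = new_values


def greedy_policy(values):
    """Pick the action with the best one-step lookahead in each state."""
    return {
        s: max(ACTIONS, key=lambda a: reward(s, a) + GAMMA * values[step(s, a)])
        for s in range(N_STATES)
        if s != GOAL
    }


values = value_iteration()
policy = greedy_policy(values)
```

With a state this clean the planner is a few dozen lines, which is roughly the point of the comment: the difficulty sits in producing `state`, not in solving for `policy`.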

6 comments

glenngillen 825 days ago

I love this perspective.

It’s also made me draw parallels with the experience of working with actual people, especially others in my household. With young children who are at the early “doing household chores” stage of development, there is basically constant refinement of what “clean the floor”, “put things away”, etc. _really_ mean. I know my wife and I have different definitions of these things too. Our ability to be clear and exhaustive enough upfront on the definitions to have a complete perception and set of assumptions is basically non-existent. We’re all only human! But our willingness to engage in fixing that with humans is also high. If my kids repeatedly miss a section under some chairs when vacuuming, we talk about it and know it will improve. When my Roomba does it, it sucks and can’t do its job properly. Even when hiring professional tradespeople to come do handiwork, it’s rarely perfect the first time. Not because they’re bad, just because being absolutely precise about things upfront can be so difficult.

ska 825 days ago

Really there are three problems in robotics: Perception, Funding, and Cables :)

etrautmann 825 days ago

Connectors imo :)

taneq 825 days ago

And fasteners. I swear any automation system is 90% cables, connectors and fasteners by weight.

leoedin 825 days ago

Totally. I worked on the electronics in robot arms for a while and EVERY TIME there was a failure in the field - it was the cables.

smoldesu 825 days ago

Only one of them is fun to manage.

marcosdumay 825 days ago

Perception, right?

transitionnel 825 days ago

It's so great to read genuine yet experienced insight like this.

Like last night on Twitter I saw an opening for Robotic Behavior Coordinator at Figure. I know for sure, having analyzed this problem with "nothing else" to do for 20 years, I would crush it with humility, and humanity would profit in orders of magnitude.

But they are not set up to hand me control of the rounding error of $40M I'd like [and would pay forward], *nor would their teams listen to me, due to human nature and academ-uenza*.

Such is our loss.

(as you ~say, "reinventing the ecosystem from scratch...")

nharada 825 days ago

> humility

> humanity would profit in orders of magnitude

transitionnel 825 days ago

>> touché :)

>> but please believe, I would not risk ostracism on this (my favorite) forum if I were not [approaching] 100% sure.

transitionnel 825 days ago

Ah, sorry if I sounded like a douche.

Have my Y-C idea now.

here we gooooo ..!.. ;)

yakz 825 days ago

> even a specious attempt to define generally what "object" and "put away" mean is still 100s of PhD theses away

Is this part still true? There are widely available APIs (and even running at home on consumer level hardware to some extent) that can pick an object out of an image, describe what it might be useful for and where it could go.

jvanderbot 825 days ago

Imagine you program a robot to "put away" a towel. Then it opens the door and finds there's a cup in the place already. Now what? Or a mouse. Or a piece of paper that looks like a towel in this lighting. Or a child.

Imagine the frustration if the robot kept returning to you saying "I cannot put this away". You'd get rid of the robot quickly. Reasoning at that level is so difficult.

But then imagine it was just a towel all along - oops, your perception system screwed up and now you put the towel in the dishwasher. Maybe this happens 1/1,000,000 times, but that person posts pictures on the internet and your company stock tanks.

kajecounterhack 825 days ago

Most robotics companies today still use traditional tracking and filtering (e.g. Kalman filters) to help with associating detected objects with tracks (objects over time). Solving this in a fully differentiable / ML-first way for multiple targets is still WIP at most companies, since deepnet-to-detect + filtering is still a strong baseline and there are still challenges to be solved.

Occlusions, short-lived tracks, misassociations, low frame rate + high-rate-of-change features (e.g. flashing lights) are all still very challenging when you get down to brass tacks.
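For readers who haven't seen the "filtering" half of that baseline, here is a minimal sketch of a constant-velocity Kalman filter for a single track along one axis. It is not any company's implementation. The class name, the noise values, and the simplification of process noise to `q` times the identity are all assumptions made for illustration.

```python
class ConstantVelocityKalman1D:
    """Tracks position and velocity of one object along one axis."""

    def __init__(self, q=1e-3, r=0.25, initial_var=100.0):
        self.pos, self.vel = 0.0, 0.0
        # Covariance P stored as four scalars (2x2, row-major); start uncertain.
        self.p00, self.p01 = initial_var, 0.0
        self.p10, self.p11 = 0.0, initial_var
        self.q = q  # process noise, simplified to q * identity (assumption)
        self.r = r  # measurement noise variance (assumption)

    def predict(self, dt=1.0):
        """Propagate state and covariance with F = [[1, dt], [0, 1]]."""
        self.pos += dt * self.vel
        p00 = self.p00 + dt * (self.p10 + self.p01) + dt * dt * self.p11 + self.q
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + self.q
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11
        return self.pos

    def update(self, z):
        """Fuse a position measurement z, with H = [1, 0]."""
        residual = z - self.pos
        s = self.p00 + self.r                 # innovation variance
        k0, k1 = self.p00 / s, self.p10 / s   # Kalman gain
        self.pos += k0 * residual
        self.vel += k1 * residual
        p00 = (1.0 - k0) * self.p00
        p01 = (1.0 - k0) * self.p01
        p10 = self.p10 - k1 * self.p00
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11


# Usage: a detector reports positions 0, 1, 2, ... once per frame.
kf = ConstantVelocityKalman1D()
for frame in range(20):
    kf.predict(dt=1.0)
    kf.update(float(frame))
```

The predicted position from `predict()` is what a tracker would compare against new detections when deciding which detection belongs to which track. Occlusions and misassociations are exactly the cases where that comparison goes wrong.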

ska 825 days ago

It's definitely not a solved problem in general, especially in realtime.

It's a lot easier to get started on something interesting and maybe even useful than it was even 10 years ago.

A lot of the "ah we can just use X API" falls apart pretty fast when you do risk analysis on a real system. Lots of these APIs do a decent job most of the time under somewhat ideal conditions; beyond that things get hairy.

kaibee 825 days ago

> that can pick an object out of an image

You have to do it in real time, from a video feed, and make sure that you're tracking the same unique instance of that object between frames.
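A rough sketch of what "tracking the same unique instance between frames" involves at its simplest: match each new detection to an existing track by bounding-box overlap (IoU), greedily, best overlap first. The function names, the `(x1, y1, x2, y2)` box format, and the 0.3 threshold are assumptions for illustration. Real systems typically add motion prediction and appearance features on top.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, min_iou=0.3):
    """Greedy matching of detections to tracks.

    tracks: dict of track_id -> last known box
    detections: list of boxes from the current frame
    Returns (matches, unmatched) where matches maps detection index to
    track_id and unmatched lists detection indices that start new tracks.
    """
    pairs = sorted(
        (
            (iou(box, det), track_id, j)
            for track_id, box in tracks.items()
            for j, det in enumerate(detections)
        ),
        reverse=True,
    )
    matches, used_tracks = {}, set()
    for score, track_id, j in pairs:
        if score < min_iou:
            break  # remaining pairs overlap too little to trust
        if track_id in used_tracks or j in matches:
            continue
        matches[j] = track_id
        used_tracks.add(track_id)
    unmatched = [j for j in range(len(detections)) if j not in matches]
    return matches, unmatched


# Usage: two known objects moved slightly; a third box is new.
tracks = {1: (0, 0, 10, 10), 2: (20, 20, 30, 30)}
detections = [(21, 21, 31, 31), (1, 1, 11, 11), (50, 50, 60, 60)]
matches, unmatched = associate(tracks, detections)
```

This works when objects move little between frames and never overlap. It breaks down at low frame rates or when two objects cross, which is where the identity swaps come from.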

lukan 825 days ago

Robots could make a short stop or go slower to process an unclear picture; that is probably not the problem. But the image processing itself is still way too unreliable. Under ideal conditions it mostly works, but have some light fog in the picture or strong sunlight and ... usually it all fails.

Otherwise the Teslas would indeed have a full self-driving mode, using only cameras.

thfuran 825 days ago

>Robots could make a short stop or go slower to process an unclear picture

The costs of doing so are hugely dependent on the application. It is not, for example, an attractive strategy for an image-guided missile, though it's probably fine for an autonomous vacuum cleaner.

YeGoblynQueenne 825 days ago

And then you need to grasp it.

numpad0 825 days ago

If someone could readily do it using GPT-4V with its apparent sentience, it must be happening already. So far there have been just a few demos that show obvious signs of manual programming, manual remote operation, and/or even VFX editing in some cases.

transitionnel 825 days ago

That language sounds borne of hair-pulling disbelief.

If they can put ImageNet on a SOC, they can do it. [probably too big/watt]

Better yet: ImageNet bones on SOC, cacheable "Immediate Situation" fed by [the obvious logic programming that everyone glances past :) ]

transitionnel 825 days ago

> This is how Cybernetics starts y'all. <

contingencies 826 days ago

Cute quote - added to https://github.com/globalcitizen/taoup :)

I would add supply chain, however.

transitionnel 824 days ago

To solve that:

Assumption: Apple's supply chain is gold standard [~max iterative tech envelope push & max known demand]

Hypothesis: This is swiftly re-creatable for any [max believable & max useful] product. "Detroit, waiting".

jvanderbot 826 days ago

An honor! Pleased to contribute.

DrDroop 825 days ago

What about transformers for robotics, like ALOHA? They seem to help with learning new tasks.