Hacker News
by vosper 1115 days ago
> Elon and Andrej Karpathy argued that since humans can drive using just vision, that’s how we should do it in self driving cars

I thought their argument was a little more like “since roads are designed for human vision, we should take a vision-based approach, too”.

Not saying it’s the right idea, just that’s how I thought they had put it.

5 comments

I never bought that argument, considering roads exist in three-dimensional space and our eyes and brain are constantly trying to reconstruct depth from 2D images. Seems like an extra hop that would be better cut out.
I agree with your central point but take issue with this characterisation of human vision. For people who have two functioning eyes, the perception of depth is baked in. Our subjective experience of a 2d image is an illusion. In fact, much of our vision isn’t quite what we think; for example, what we think we’re seeing in our peripheral vision may actually get filled in based on inference and prediction.

https://neurosciencenews.com/peripheral-vision-brain-illusio...

> For people who have two functioning eyes, the perception of depth is baked in.

Actually, my understanding is that the depth perception induced by binocular vision is relevant only within a relatively short range (like, single-digit number of feet away), which makes it relatively useless for long-distance depth perception needed for driving.

A bit longer than this-- 10 meters or so.

So it's not useless for e.g. pulling into a parking spot or steering around a close vehicle.

(I'm cross-eyed and don't benefit from binocular depth cues. For the most part I do alright, though occasionally I'm comically off when someone throws me a ball or I'm picking up something close to me.)

I have one eye. Can confirm it's the same for me. Also always comically off with baseball and tennis. Pouring tea is also tricky for me unless I am holding both the pot and the cup.
Do you swivel your head like an owl to get parallax? I find if I do that in inclement weather I feel better.
Nah. Maybe sometimes I lean forward and back a little to judge something, but people with binocular vision do that too.

Most of our depth perception isn't from stereopsis or other binocular cues.

Swiveling doesn't really give parallax; you need lateral linear motion. Think of a soccer goalie getting into position.
Stereo cameras would allow a neural network to do a similar thing.
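
For anyone who wants rough numbers on the stereo and parallax points above, here is a minimal Python sketch of the standard pinhole stereo relation Z = f * B / d and its first-order error. The focal length in pixels, the 6.5 cm baseline (roughly a human interpupillary distance), the quarter-pixel matching error, the 0.5 m lateral shift, and all function names are illustrative assumptions, not anything taken from a real self-driving stack.

```python
import math


def stereo_depth_m(focal_px, baseline_m, disparity_px):
    """Pinhole stereo relation: depth Z = f * B / d."""
    return focal_px * baseline_m / disparity_px


def stereo_depth_error_m(focal_px, baseline_m, depth_m, disparity_err_px=0.25):
    """First-order depth uncertainty: dZ ~= Z^2 / (f * B) * dd.

    disparity_err_px is an assumed matching error, not a measured value.
    """
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_err_px


def parallax_deg(lateral_shift_m, depth_m):
    """Angular shift of a point at depth_m when the viewer moves sideways."""
    return math.degrees(math.atan2(lateral_shift_m, depth_m))


if __name__ == "__main__":
    FOCAL_PX = 1000.0    # assumed focal length in pixels
    BASELINE_M = 0.065   # assumed baseline, about the spacing of human eyes
    for depth in (2.0, 10.0, 50.0, 100.0):
        err = stereo_depth_error_m(FOCAL_PX, BASELINE_M, depth)
        shift = parallax_deg(0.5, depth)  # assumed 0.5 m sideways movement
        print(f"depth {depth:6.1f} m  stereo error ~{err:7.2f} m  "
              f"parallax from 0.5 m shift ~{shift:5.2f} deg")
```

Under these assumptions the stereo error grows with the square of the distance, which fits the claim that binocular cues mostly matter at close range. A wider baseline, or sideways motion, shrinks it.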
It's difficult to accept Elon and Andrej's reasoning. I suspect that if one asks a team of engineers to research designing a self-driving car, the team wouldn't come back with the argument to use vision since "humans can do it with eyes." I'd expect a list of options with the pros and cons of each approach, along with an estimated timeline and cost.
More like: since lidars were $100,000 (at the time), we can't sell that, but we can say our advanced, any-day-now vaporware means we don't need lidar.

Lidar has since dropped in price by a lot, an order of magnitude or more.

That doesn't nullify their argument, though. If Lidar were free, it doesn't mean you need to have Lidar to achieve the same level of performance as humans, and having Lidar doesn't mean the data is so clean that the decision-making aspect of self-driving becomes solvable in a weekend.
There are still two salient points there:

- why would the goal be "the same level of performance as humans"?

For context, some towns are actively removing cars from whole areas not just for pollution impact but also for pedestrian safety. Moral issues aside, the status quo is just not enough, it needs to be way better.

- achieving the same level as humans being possible in theory doesn't mean we'll get there in practice.

Having enough hardware to realize something doesn't help if the software is not up to the task. And assuming they "just" solve the software issue could be like assuming 18th century people would "just" discover relativity.

Software becoming as good as humans at video processing just feels like a "general AI is around the corner" kind of expectation.

> Having enough hardware to realize something doesn't help if the software is not up to the task. And assuming they "just" solve the software issue could be like assuming 18th century people would "just" discover relativity.

This is what I meant by the last part of my argument. Lidar supposedly gives an extremely thorough 3D depth map, hopefully captured at hundreds or thousands of FPS. But even if you have this data, the actual bulk of the problem with current self-driving isn't solved: the "business logic" for how to navigate the world smoothly and efficiently and to 'communicate' with other road users.

One tech is currently driving passengers around commercially without drivers and the other isn't.
I'm pretty sure their argument is "this will be cheaper, and therefore more profitable."
There is another argument: this will be cheaper to produce, and therefore cheaper at retail for the same margins, and therefore will sell many more instances, and therefore will save many more lives.

It's possible that the richest person on Earth is more concerned with doing good slash achieving his goals vs obtaining more currency/profit, which it would seem would have little to no marginal utility to him.

Maybe he could have spent a little more time making sure the darn things actually work then.
I am pretty sure that Cruise, Waymo, and Tesla are each and all doing everything they can to "make sure the darn things actually work". It's literally an existential crisis for them if they do not.

They are all, in the terminology, presently "default dead" until they figure it out.

Waymo figured it out though—I've taken several driverless rides with them.
Elon Musk was actually passed by Bernard Arnault (LVMH) this year.
Not anymore!
That’s like saying “if cooking is just following directions in a recipe, we should just follow directions in a recipe.”

The result is subpar food because most recipes have a 1% problem called “seasoning”.

The “seasoning” of driving (the completely unpredictable 1% of situations behind the wheel where you just have to draw on your intuition and gut instinct) is the reason we need nothing short of AGI for _completely_ self-driving vehicles.

I do think, though, trucking is ripe for AI disruption.