by Zenzero 639 days ago
I'm no expert, but anyone can compare the sensors on a Waymo with those on a Tesla and reasonably assume that Waymo doesn't carry all of that extra hardware just for looks.

I don't see how Tesla is even a serious contender.


I think Tesla assumes that having more and better sensors doesn't really solve the hard problems - just like having more cooling in your computer allows for more performance, but it's the die shrinks that get you the real scaling.

It's possible that there exists some error metric inside Tesla that consistently goes down with more training and bigger neural nets in their Vision FSD - whereas switching to LIDAR would reduce that error by a fixed 30%.

They just assume that vision will eventually work out.
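
To make that arithmetic concrete, here is a rough sketch. All numbers are made up for illustration: the 10% improvement per training round is an assumption, and only the 30% figure comes from the hypothetical above. It asks how many rounds of compounding improvement it takes to match a one-time fixed cut.

```python
import math


def error_after(rounds: int, per_round_cut: float, start: float = 1.0) -> float:
    """Error remaining after `rounds` rounds, each cutting error by `per_round_cut`."""
    return start * (1.0 - per_round_cut) ** rounds


def rounds_to_match(one_time_cut: float, per_round_cut: float) -> int:
    """Smallest n such that (1 - per_round_cut)**n <= 1 - one_time_cut."""
    return math.ceil(math.log(1.0 - one_time_cut) / math.log(1.0 - per_round_cut))


if __name__ == "__main__":
    # Hypothetical: a sensor change gives a one-time 30% cut, scaling gives 10% per round.
    n = rounds_to_match(0.30, 0.10)
    print(f"rounds needed to match the one-time cut: {n}")
    print(f"error after {n} rounds: {error_after(n, 0.10):.4f}")
```

If the per-round improvement ever stalls, the comparison flips, which is the bet being described.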

While I understand your argument and agree that more sensors mean an easier path to self-driving, our world is now full of systems that can infer amazing amounts of data from a few sensors. This is the progress of technology.

The Apple Watch is probably one of the greatest examples. So many of its features are inferred via "basic" sensors.

On a different angle, sports refereeing is largely becoming possible due to advances in camera-based analysis. We can turn 2D images into a nearly centimeter-accurate representation of a playing field in seconds.

> On a different angle, sports refereeing is largely becoming possible due to advances in camera based analysis. We can turn 2d images into a nearly centimeter accurate representation of a playing field in seconds.

These cameras are in a very different and much less dynamic environment than ones on a car speeding down the road at 100+ km/h while getting splashed on, shat on, dusted on, muddied, struck by bugs, snowed on, etc.

I have a hard time believing that stereoscopic image analysis will ever surpass the efficiency of lidar map analysis of 3D spaces. Given how hard self-driving is, it would make sense to first turn the ugliest, sensor-packed vehicle into a working model, and then miniaturize/prune from there.

Starting with "basic" sensors is backwards. It is like aspiring to become a chess grandmaster so good you can play with your eyes closed, and starting out as a beginner with your eyes closed.

The hypothesis was that LIDAR was a crutch: humans manage with just vision.

Whether this is correct for delivering self driving cars, we will find out soon enough. Long term though, it definitely makes sense. We just don't know what the missing pieces of the puzzle are.

> humans manage with just vision

This is commonly repeated but very obviously untrue.

We don't only have vision. We have a general intelligence, coupled with vision. In the absence of AGI, the base assumption has to be that the sensor apparatus needs to be significantly superior to a human's for an FSD system to drive at a comparable level.

Not to mention it is also untrue because we use senses other than just vision when we drive. We use our inner ears for acceleration information, sometimes our hearing, and the feel of the wheel.

We don't have anything close to LIDAR, though.

And a car doesn't have anything close to a human brain.

Humans process sensory data in a fundamentally different way to anything that's possible for a self-driving car. The idea that we should base the decision about the sensors on what humans have just fundamentally makes no sense.

Lidar substitutes hardware for something which humans find easy and CV systems find hard - creating a map of the environment. Humans do that by using a brain. CV systems based purely on video really struggle to do that in lots of edge cases. You can shortcut that in a car by using something like lidar.

You are right.

Would you agree then, that if the goal was to develop AGI, just relying on vision is a credible choice?

No. Why should the design parameters for AGI be limited by what a human can do? If the goal were AGI, then I'd want all kinds of additional sensor input that humans don't have.

Once it's a solved problem, yes, it makes sense to think about design parameters.

When learning how to solve problems, that is not as helpful.

> humans manage with just vision.

But they don't. I can't see how anyone could look at modern driving and see an optimal state. Driving isn't being managed at all, it's killing droves of humans.

If we put the same restrictions on airplanes (flying by instrument is a crutch), everyone would rightfully find that ridiculous.

They appear to have bet on the wrong technology. The failure happened back in the design phase.

Aircraft often don't have vision at all in regular operation.

If a driver doesn't have vision, the right decision is to figure out how to safely stop.

Human vision has a dynamic range of roughly 21 stops, plus other differences. Do we have any cameras that come close to the human eye's "specs"?
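
For comparing that figure against camera datasheets, which often quote dynamic range as a contrast ratio or in dB rather than in stops, here is a small conversion sketch. It assumes each stop is a doubling of light and uses the 20·log10 convention common for image sensors. The 21-stop figure is simply the one quoted above, not a verified measurement, and the function names are made up for illustration.

```python
import math


def stops_to_contrast_ratio(stops: float) -> float:
    """Each stop doubles the light, so N stops span a 2**N : 1 contrast ratio."""
    return 2.0 ** stops


def stops_to_db(stops: float) -> float:
    """Dynamic range in dB, assuming the 20*log10(ratio) sensor convention."""
    return 20.0 * math.log10(stops_to_contrast_ratio(stops))


if __name__ == "__main__":
    stops = 21  # figure quoted in the comment above
    print(f"{stops} stops = {stops_to_contrast_ratio(stops):,.0f}:1 = {stops_to_db(stops):.1f} dB")
```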

The missing piece may be a different mode of transport, like trains. Humans evolved from creatures that lived in trees over millions of years; a computer has nothing on that evolutionary process of the bad tree jumper getting eaten or breaking a leg and dying.

Spend a few million years programming a computer to swing through trees and you'll probably get something that can drive a car.

We have close to that much in training data in the form of cars driven by humans.

What we (still) lack are the fundamental algorithms to learn from video. Approaches like LLM-style tokenization or diffusion are starting to fall short of this goal.

The missing piece is LIDAR.

Great, I hope I can someday be this confident about predicting the future!