by mrkeen 645 days ago
What software breakthroughs have they made that make you believe that FSD won't be 5 years away for the rest of time?

Not sure - but Waymo and a few Chinese vendors already have self-driving robotaxis in production. That, combined with the AI boom, tells me that building a self-driving car (even if geofenced and/or L3) is on the horizon.

If that's the case, Tesla will probably figure this out as well.

"Traditional" self-driving stacks use a ton of pre-built maps, lidar and other range sensors, and has teams of people keeping those maps registered and up to date.

Tesla's plan is (or has become) to do an end-run around all that and just train a giant network on camera-only sensor stacks, so that it can navigate without large 3D representations of the environment / city in which it works, without expensive lidar/radar sensor suites, and without the "partner" phase that Waymo and others go through with particular cities.

This allowed them to bring me, a MN customer, something like lvl 3 autonomy before any other company did. But it might not have the same upper bound as other, more fine-tuned approaches do, and having ridden in Waymo, Nuro, etc. vs. my own Tesla, I can tell you the Tesla is wonkier for it. Time will tell.

> something like lvl 3 autonomy before any other company did

I'm quite sure Mercedes-Benz was the first to bring lvl 3 autonomy on the market.

https://arstechnica.com/cars/2023/09/mercedes-benzs-level-3-...

It is also the only carmaker confident enough in the system that it takes full liability for it:

> Confidence in Drive Pilot is high within Mercedes-Benz, as the system has been active in Germany for over a year without incident. That confidence is demonstrated by Mercedes’ decision to assume liability for the vehicle while Drive Pilot is in use. That’s a particularly bold move since no other manufacturer offers that kind of assurance.

According to that article Mercedes-Benz's system is exclusively highway driving. Technically level three, but not "full self driving" as most people would understand it, or as Tesla defines the term.
Level 3 autonomous driving is not FSD.
SAE doesn't have a definition for "full self driving", only levels of autonomy. "FSD" is a term Tesla came up with to distinguish it from their previous level 2 Autopilot system, which could only do highway driving, whereas "full" level 2 self driving can operate under all normal conditions, including city driving. FSD could theoretically cover levels 2, 3, 4, or 5. Highway-only could be levels 1, 2, 3, or 4. There's a lot of overlap.
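One way to picture that overlap is to treat the automation level and the operating domain as two independent axes. The sketch below is only an illustration: `Domain`, `System`, and the example entries are hypothetical names, and the level sets are taken from the comment above, not quoted from the SAE standard.

```python
from dataclasses import dataclass
from enum import Enum


class Domain(Enum):
    HIGHWAY_ONLY = "highway only"
    ALL_NORMAL_ROADS = "highway and city"


# Level ranges as stated in the comment above (an assumption, not SAE text).
POSSIBLE_LEVELS = {
    Domain.HIGHWAY_ONLY: {1, 2, 3, 4},
    Domain.ALL_NORMAL_ROADS: {2, 3, 4, 5},
}


@dataclass(frozen=True)
class System:
    name: str
    level: int
    domain: Domain

    def is_consistent(self) -> bool:
        """True if this level is possible for this operating domain."""
        return self.level in POSSIBLE_LEVELS[self.domain]


# Hypothetical examples: same axes, different positions on each.
examples = [
    System("highway-only level 3 system", 3, Domain.HIGHWAY_ONLY),
    System("all-roads level 2 system", 2, Domain.ALL_NORMAL_ROADS),
]

# Levels that can describe either kind of system.
overlap = (
    POSSIBLE_LEVELS[Domain.HIGHWAY_ONLY]
    & POSSIBLE_LEVELS[Domain.ALL_NORMAL_ROADS]
)
```

Under those stated ranges, `overlap` holds the levels that say nothing on their own about where a system can drive, which is why "level 3" and "FSD" are not interchangeable labels.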
Mine was a personal example, not a market analysis :)

I'm quite confident that lvl3 autonomy is becoming widespread, regardless.

End users don't care about the tech details as long as it works - and it does, so Tesla might start sweating about Waymo eating their lunch. Maybe they'll also move towards the mapping approach, which would mean they'd have to keep maps constantly updated. That'd mean recurring costs for them.

Besides, I'm pretty sure some degree of mapping is necessary - I know some seriously wonky roads with poor visibility, tons of shoulder lanes, roundabouts, and stop-and-go traffic, where I need to know which lane to get in half a kilometer before the turn comes up.

Most people can't figure it out at first glance - I usually see a couple trying and failing every day.

A slight correction: in a recent interview, Karpathy (ex-Tesla AI researcher / engineer) clarified that Tesla uses additional sensors during training, but deployment only uses cameras.
I'm no expert, but anyone can look at the sensors placed on a Waymo compared to a Tesla and likely rightly assume Waymo doesn't have all of that extra stuff just for looks.

I don't see how Tesla is even a serious contender.

I think Tesla assumes that having more and better sensors doesn't really solve the hard problems - just like having more cooling in your computer allows for more performance, but it's the die shrinks that get you the real scaling.

It's possible that there exists some error metric inside Tesla that consistently goes down with more training and bigger neural nets in their Vision FSD - whereas switching to LIDAR would reduce that error by a fixed 30%.

They just assume that vision will eventually work out.
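To make that argument concrete, here is a toy model, not anything known about Tesla's internal metrics. It assumes error falls as a power law in training scale, `error = a * scale**(-b)`, with made-up constants `a` and `b`, and treats the hypothetical 30% lidar improvement as a one-time multiplier.

```python
def vision_error(scale: float, a: float = 1.0, b: float = 0.3) -> float:
    """Hypothetical power-law error curve; a and b are illustrative constants."""
    return a * scale ** (-b)


def scale_factor_to_match(reduction: float, b: float = 0.3) -> float:
    """Extra scale needed to equal a fixed fractional error reduction.

    Solves (k * scale)**(-b) == (1 - reduction) * scale**(-b) for k.
    """
    return (1.0 - reduction) ** (-1.0 / b)


k = scale_factor_to_match(0.30)

base = vision_error(100.0)
with_lidar = 0.70 * base                   # one-time 30% cut from the extra sensor
with_more_scale = vision_error(100.0 * k)  # same error reached by scaling alone
```

With the illustrative exponent `b = 0.3`, `k` comes out a bit above 3: in this toy model the fixed 30% gain is worth a finite multiple of training scale. The conclusion depends entirely on the assumed exponent and on the curve continuing to follow a power law, neither of which is known from outside.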

While I understand your argument and agree that more sensors mean an easier path to self-driving, our world is now full of systems that can infer amazing amounts of data from a few sensors. This is the progress of technology.

Apple Watch is probably one of the greatest examples. So many of its features are inferred via "basic" sensors.

On a different angle, sports refereeing is largely becoming possible due to advances in camera based analysis. We can turn 2d images into a nearly centimeter accurate representation of a playing field in seconds.

> On a different angle, sports refereeing is largely becoming possible due to advances in camera based analysis. We can turn 2d images into a nearly centimeter accurate representation of a playing field in seconds.

These cameras are in a very different and much less dynamic environment than on a road speeding at 100+ km/h while getting splashed on, shat on, dusted on, muddied, struck by bugs, snowed on, etc.

I have a hard time believing that stereoscopic image analysis will ever surpass the efficiency of lidar map analysis of 3D spaces. Given how hard self driving is, it would make sense to make the ugliest, sensor-packed vehicle a working model, and then miniaturize/prune from there.

Starting with "basic" sensors is backwards. It is like aspiring to become a chess grandmaster so good you can play with your eyes closed, and starting out as a beginner with your eyes closed.

The hypothesis was that LIDAR was a crutch, humans manage with just vision.

Whether this is correct for delivering self driving cars, we will find out soon enough. Long term though, it definitely makes sense. We just don't know what the missing pieces of the puzzle are.

> humans manage with just vision

This is commonly repeated but very obviously untrue.

We don't only have vision. We have a general intelligence, coupled with vision. In the absence of AGI, the base assumption has to be the sensor apparatus needs to be significantly superior to humans for an FSD system to drive at a comparable level.

Not to mention it is also untrue because we use senses other than just vision when we drive. We use our inner ears for acceleration information, sometimes our hearing, and the feel of the wheel.
We don't have anything close to LIDAR though
You are right.

Would you agree then, that if the goal was to develop AGI, just relying on vision is a credible choice?

No. Why should the design parameters for AGI be limited by what a human can do? If the goal was AGI then I'd want all kinds of additional sensor input that humans don't have.
> humans manage with just vision.

But they don't. I can't see how anyone could look at modern driving and see an optimal state. Driving isn't being managed at all, it's killing droves of humans.

If we put the same restrictions on airplanes (flying by instrument is a crutch), everyone would rightfully find that ridiculous.

They appear to have bet on the wrong technology. The failure happened back in the design phase.

Aircraft often don't have vision at all, in regular operation.

If a driver doesn't have vision, the right decision is to figure out how to safely stop.

Human vision has a dynamic range of roughly 21 stops, plus other differences. Do we have any cameras that come close to the human eye's "specs"?
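For reference, a stop is a doubling of light, so a dynamic range in stops converts to a contrast ratio of `2**stops`, and to roughly 6 dB per stop under the `20 * log10` convention commonly used for image sensors. The helpers below are a small sketch for putting a sensor's quoted figure on the same scale as the 21-stop number above; the function names are made up for illustration, and the 21-stop figure is the commenter's, not a verified measurement.

```python
import math


def stops_to_contrast_ratio(stops: float) -> float:
    """Each stop doubles the brightest-to-darkest ratio."""
    return 2.0 ** stops


def stops_to_db(stops: float) -> float:
    """Dynamic range in decibels, using the 20*log10(ratio) convention."""
    return 20.0 * math.log10(stops_to_contrast_ratio(stops))


def db_to_stops(db: float) -> float:
    """Inverse of stops_to_db, for sensors specified in dB."""
    return db / (20.0 * math.log10(2.0))


eye_ratio = stops_to_contrast_ratio(21)  # ratio implied by the 21-stop claim
eye_db = stops_to_db(21)                 # the same claim expressed in dB
```

A camera datasheet that quotes dynamic range in dB can be passed through `db_to_stops` for a like-for-like comparison, keeping in mind that quoted sensor figures and estimates for the eye are measured under different conditions.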
The missing piece may be a different mode of transport like trains. Humans adapted from creatures that lived in trees over millions of years, a computer has nothing on that evolutionary process of the bad tree jumper getting eaten or breaking a leg and dying.

Spend a few million years programming a computer to swing through trees and they'll probably get something that can drive a car.

We have close to that much in training data in the form of cars driven by humans.

What we lack is (still) the fundamental algorithms to learn from video. Tokenization like LLMs or diffusion are starting to fall short of this goal.

The missing piece is LIDAR.
Great, I hope I can someday be this confident about predicting the future!
I have Tesla FSD, it already regularly drives me point to point with zero interventions.
I don't think anyone disputes that Tesla has a very decent ADAS system, which is what you're describing. As an individual driver, though, you don't have the statistical power or access to the design process to see what would be different about an autonomous system without the possibility of local intervention. It brings some very different requirements.