by ma2rten 3455 days ago
I feel like people keep underestimating how difficult it is to build self-driving hardware and software. You can build a prototype in a couple of months with a small team of people, as comma.ai showed, but getting from there to something that is at least as reliable as a human driver is still an open problem.

Building the car itself, creating a ridesharing network, finding partners, swaying consumer perception and even regulatory challenges are going to be a walk in the park in comparison.

Google/Waymo is well positioned to be the first to solve this problem, since they have been working on this for a decade and because Google has a big advantage in Machine Learning and Computer Vision research. On the other hand Tesla already has cars on the road collecting data.

2 comments

> On the other hand Tesla already has cars on the road collecting data.

but not lidar data, which is why I agree with your statement that Google is well positioned. Lidar is expensive now, but I expect self-driving cars to be the reason that changes.

How important is lidar? Is it some kind of a silver bullet for self driving cars?
You need both visible cameras and LIDAR (and RADAR) currently.

Cameras give you rich intensity information: you can easily identify road markings, signs, etc provided the scene is well illuminated. Depth from stereo is OK, but struggles if the scene is featureless. Cameras perform about as well as your eyes in inclement weather. The idea that cameras don't work at night isn't entirely true, because your car should have headlights.

LIDAR is used for robust distance measurement. You can make spot measurements on pretty much any non-specular (non-shiny) surface. It's active so it works without external illumination (so you get 360° 3D vision at night, not just where your lights point), it's accurate enough for driving (cm-level at tens of metres) and by paralleling sensors you can get realtime performance. The Velodyne system uses 64 rx/tx pairs for ~1 Mpt/s (about a million points per second). In practice you get around 20k LIDAR points per camera image because those 1 million points are spread over a hemisphere and your camera is imaging at 30 or 60 fps.
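The arithmetic behind that points-per-image estimate can be sketched roughly as below. The function name is made up for illustration, and the only inputs are the ~1 Mpt/s rate and the 30/60 fps camera rates quoted above.

```python
def lidar_points_per_frame(points_per_second: float, camera_fps: float) -> float:
    """Average number of LIDAR returns collected during one camera frame."""
    return points_per_second / camera_fps


LIDAR_RATE = 1_000_000  # ~1 Mpt/s, the figure quoted in the comment

for fps in (30, 60):
    per_frame = lidar_points_per_frame(LIDAR_RATE, fps)
    print(f"{fps} fps: ~{per_frame:,.0f} points per frame over the whole scan")
```

The quoted ~20k figure sits between the two results. Only the part of the scan that falls inside the camera's field of view would actually land on a given image, so the usable count per image is lower still.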

Tesla seem to think they'll be OK with radar and cameras, and not lidar - that's the current hardware revision.
True, probably should be an and/or, but almost certainly you'll have a better time using both - RADAR's spatial resolution isn't nearly as good. Also probably because if they tried to put a LIDAR rig into a consumer car it'd push the price considerably. Unless they could magically get the price down (as Waymo claim) it would be a healthy fraction of the cost of the car. The Velodyne x64 unit cost about $75k not too long ago.

You can do weird tricks with RADAR though, like the two-car lookahead thing that Tesla has implemented. You can do that with LIDAR if you're looking at a mirror (you'll measure the distance to the thing in the mirror), but typically multipath returns don't have a high enough SNR to be useful.

When does one ever drive in a featureless place? Are you talking about an empty street covered in snow? Or do you have any other situation in mind?

If a car can self-drive anywhere, but will refuse to take control in an empty street covered in snow, I guess that's not a fatal flaw.

Street surfaces are relatively featureless (uniform grey) at low resolution or quality and a lot of matching algorithms will fail on them. With modern high resolution cameras and algorithms like SGM (see Heiko Hirschmuller's work) you can do pretty well nowadays, but it's not a panacea.

It doesn't necessarily need to be absolutely featureless: more specifically stereo matching suffers when there is local texture that is not sufficiently unique along an epipolar line (normally we use rectified images so epipolar = along the image rows). For a concrete example, if I showed you (or a computer) a small patch of road (< 15x15 px square) and told you to find its location in another image, you would struggle because of the ambiguity. This happens all the time 'in the wild'. Cars are shiny, which means specular reflections everywhere; global illumination differences are fine, but local differences cause problems. Matching surfaces like glass is also hard. Someone else mentioned the sides of artic lorries.
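A toy illustration of that ambiguity, assuming a plain sum-of-absolute-differences (SAD) cost along a single rectified row. This is not SGM, and the pixel values, the helper names and the noise tolerance are all invented for the example.

```python
def sad(a, b):
    """Sum of absolute differences between two equal-length patches."""
    return sum(abs(x - y) for x, y in zip(a, b))


def match_costs(patch, row):
    """SAD cost of the patch at every candidate offset along one image row."""
    width = len(patch)
    return [sad(patch, row[i:i + width]) for i in range(len(row) - width + 1)]


def plausible_offsets(patch, row, tolerance=0):
    """Offsets whose cost is within `tolerance` of the best cost.

    `tolerance` stands in for sensor noise: costs that close together
    cannot be told apart in practice.
    """
    costs = match_costs(patch, row)
    lowest = min(costs)
    return [i for i, cost in enumerate(costs) if cost - lowest <= tolerance]


# Hypothetical 1-D rows of grey levels (0-255) from the second camera.
textured_row = [12, 200, 45, 90, 160, 30, 220, 75, 130, 10]
road_row = [118, 120, 119, 121, 120, 119, 120, 121, 119, 120]

# Patches as seen by the first camera; the true match is at offset 3.
textured_patch = textured_row[3:6]
road_patch = road_row[3:6]

print(plausible_offsets(textured_patch, textured_row, tolerance=4))
print(plausible_offsets(road_patch, road_row, tolerance=4))
```

With a few grey levels of assumed noise, the textured patch still has exactly one plausible location, while the road patch matches every offset along the row about equally well.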

LIDAR avoids a lot of potential confusion, but I'm not suggesting that it's a catch-all. It's time consuming to scan and the data are sparse. The best systems (should) fuse data from all the different sources to maximise confidence.
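One common way to fuse independent range estimates is inverse-variance weighting. The comment does not say which fusion method real systems use, so treat this as a minimal sketch under that assumption; the readings and noise levels are hypothetical.

```python
def fuse(estimates):
    """Inverse-variance weighted average of (value, variance) pairs.

    Returns the fused value and the variance of the fused value.
    """
    weights = [1.0 / variance for _, variance in estimates]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, estimates)) / total
    return value, 1.0 / total


# Hypothetical distance estimates to the same obstacle, in metres.
readings = [
    (20.40, 1.00 ** 2),  # stereo camera, assumed sigma of 1 m
    (20.05, 0.05 ** 2),  # LIDAR, assumed sigma of 5 cm
    (19.80, 0.50 ** 2),  # RADAR, assumed sigma of 50 cm
]

distance, variance = fuse(readings)
print(f"fused distance: {distance:.3f} m, sigma: {variance ** 0.5:.3f} m")
```

The fused variance is never larger than that of the best single sensor, which is the sense in which combining sources raises confidence.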

A light colored semi trailer against a light colored sky has already been a fatal flaw.
It probably would have prevented the Tesla crash with the semi that cameras couldn't see due to low contrast with the sky. So I'd say pretty important. I'm actually shocked that Tesla has done so well without it.
At this point, LIDAR is still better than other methods for sensing and localization. However, it's quite expensive still, which is why doing sensing with cameras only has been a hot research topic.

You won't get by on LIDAR alone though. You'll need cameras for stuff like identifying road lines and signage.

From my understanding, lidar is an improvement over cameras because you can use it even when there's no light for the cameras, and it also provides 3D data.
How about someone solves the way-easier problem of boring highway driving first?

I don't understand why the full monty has to be in version 1.0. Give me a good automation on the low hanging fruit, and let convergent evolution of society, infrastructure, and technology iterate to the more difficult use cases.

That is what Tesla and other car manufacturers are doing essentially, isn't it? However, this will always require a human to be paying attention. Highway driving is boring until it suddenly isn't. Driving perfectly on a highway is just as difficult as driving perfectly in a city.

As automation becomes better humans will be paying less and less attention to the road, making this technology somewhat dangerous in the interim. There has already been a fatal accident involving a Tesla on autopilot where the driver was watching Harry Potter.

It is my belief that the boring highway situations decay in the following paths:

* Predictable termination. The user must affirm that they are ready to receive the car. :: User takes over.

* Operator emergency. The user is unresponsive or there are active indications that they require emergency assistance. :: Auto-park in the nearest (video-monitored) assistance area, with emergency response crews also en route.

* Unpredictable catastrophic failure. Internal/External doesn't matter. The vehicle is no longer responsive to the global mesh and a timeout condition occurs. Emergency response is dispatched to the area of incident automatically.

The 'driver taking over' in your scenario would be that final case. However, most of those incidents could either be detected, with minor failures avoided or tolerated, or binned into the second category. Any other cases are extreme and would result in a cascade anyway, likely even if the human were paying attention as normal.

Thus, the low hanging fruit of travel on the freeway IS the low hanging fruit. Developing a space reserved for automated vehicles and preferably app assisted ride-sharing/carpooling.
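The three decay paths listed above could be sketched as a simple decision function. Every name, the heartbeat timeout value and the order of the checks are assumptions made for illustration, not part of any real system.

```python
from enum import Enum


class Response(Enum):
    CONTINUE = "keep driving autonomously"
    HAND_OVER = "user takes over"
    AUTO_PARK = "auto-park in nearest assistance area, crews en route"
    DISPATCH = "dispatch emergency response to the area of incident"


def choose_response(seconds_since_heartbeat, driver_responsive,
                    handover_due, driver_confirmed_ready,
                    heartbeat_timeout=5.0):
    """Map the vehicle's situation onto one of the three decay paths.

    heartbeat_timeout is a made-up value for how long the mesh waits
    before declaring the vehicle unresponsive.
    """
    # Unpredictable catastrophic failure: the vehicle stopped answering the mesh.
    if seconds_since_heartbeat > heartbeat_timeout:
        return Response.DISPATCH
    # Operator emergency: the user is unresponsive or needs assistance.
    if not driver_responsive:
        return Response.AUTO_PARK
    # Predictable termination: hand over only once the user affirms readiness.
    if handover_due and driver_confirmed_ready:
        return Response.HAND_OVER
    # The comment does not cover a due handover without confirmation;
    # this sketch simply keeps driving until one of the cases above applies.
    return Response.CONTINUE


print(choose_response(0.2, True, True, True).value)
```

Checking the heartbeat first reflects the comment's framing that internal and external failures are treated alike once the timeout fires.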

Good analysis, but I think we should call out the "predictable catastrophic failures."

Humans can see the scenarios below coming from enough distance to mitigate the problem and AIs have not yet demonstrated equivalence:

1. Slow/erratic person or vehicle suddenly veers into the path of travel. AIs so far tend to just be ultra-safe here and stay far away, which is not the same thing as understanding the situation.

2. Construction zone path of travel is suddenly obstructed. (I'm picking a major scenario to exemplify a class of scenarios where humans still surpass AIs in predicting their environment.)

3. A vehicle ahead performs an emergency maneuver, which communicates unseen/undetected driving hazards. A human can reason from what the vehicle did to what the hazard might be, and immediately begin mitigating the hazard. AIs have not been very forthcoming with detailed behavior descriptions, but they all appear to still be mainly designed as advanced control loops (that is to say, mostly stateless). And yes, it's going to be a legal morass when AIs begin to follow "rules of behavior," since there will always be exceptions.

But this accident was due to a "bug" in the Model S's sensor suite and software, and reportedly would not have happened with lidar. I actually trust a self-driving Google car on a highway more than myself at 2 AM, or in fog.
I think the idea is that you can start by having complete automation (without requiring driver attention) on highways, before enabling it everywhere else. Tesla's current autopilot is neither.