by amelius 2846 days ago
The problem is that deep learning should not be allowed in safety critical systems, because (1) the accuracy is always less than 100% even in known test situations, and (2) we don't know how it works and under what conditions it breaks down.
7 comments

I think it should be allowed but the tests it should pass must be far more stringent than for traditional software. I'm happy with failure rates of around 1 catastrophic failure every million hours. Even humans sometimes fail catastrophically and black out at the wheel for no detectable reason.

That level of testing is well beyond what today's software and hardware are capable of. Waymo has to override its cars (a disengagement) approximately every 6000 miles [1], which equates to about 200 hours of driving. To reach a confidence level of 1 in a million hours you would need a test fleet of a thousand vehicles operating for a whole year without any incident occurring that requires human intervention. The costs for such testing would run into the hundreds of millions of dollars, which makes me feel like only the largest corporations in the world could develop this technology.

[1] https://www.dmv.ca.gov/portal/dmv/detail/vr/autonomous/disen...
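For anyone who wants to check the arithmetic above, here is a rough sketch. The 30 mph average speed is an assumption (it is what 6000 miles in about 200 hours implies), the fleet is assumed to drive around the clock, and the bound assumes failures arrive as a Poisson process with a constant rate. All function names are made up for illustration.

```python
import math

HOURS_PER_YEAR = 24 * 365  # assumption: vehicles operate around the clock


def hours_between_disengagements(miles_per_disengagement, avg_speed_mph):
    """Convert a miles-per-disengagement figure into driving hours."""
    return miles_per_disengagement / avg_speed_mph


def fleet_hours(vehicles, years=1.0):
    """Total driving hours accumulated by a fleet."""
    return vehicles * years * HOURS_PER_YEAR


def failure_rate_upper_bound(failure_free_hours, confidence=0.95):
    """Upper bound on failures per hour after observing zero failures.

    Assumes a Poisson process with constant rate; at 95% confidence this
    is roughly 3 / hours (the "rule of three").
    """
    return -math.log(1.0 - confidence) / failure_free_hours


hours = hours_between_disengagements(6000, 30)  # 30 mph is an assumed average
total = fleet_hours(1000)
bound = failure_rate_upper_bound(total)

print(f"hours between disengagements: {hours:.0f}")
print(f"fleet hours in one year: {total:.0f}")
print(f"95% upper bound on failure rate: {bound:.2e} per hour")
```

Under those assumptions, a thousand cars driving a full year with zero incidents would only just support a claim of fewer than one failure per million hours, which is roughly the point being made.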

And after certification, what if the company wants to push a quick update to all of its cars every now and then, through a remote update? Would that be allowed? How would we even know that it happens?
> And after certification, what if the company wants to push a quick update to all of its cars every now and then, through a remote update?

Late and a bit rough but here is my idea:

Install redundant self-driving units in at least a good number of the first few thousand cars in each generation.

When planning a release, push to the redundant unit in the cars already running in "production".

Use only the primary unit as input to the car as usual, but log the diffs between the new version and the old version in the same way they now log driver intervention.

I think there are a number of issues this won't catch. Off the top of my head: what if the new self-driving unit attempts to turn slightly faster on a slippery road, etc.

But it should be able to collect realistic feedback really fast, i.e. in a few months (crazy slow for modern application developers like me, but more than fast enough for anything that should be allowed to drive unsupervised, I guess :-)
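To make the idea a bit more concrete, here is a minimal sketch of one control step. Everything in it is hypothetical: the `Command` fields, the tolerances, and the two toy "units" are invented for illustration and are not how any real vendor does it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    steering_deg: float
    throttle: float  # 0..1
    brake: float     # 0..1


# Hypothetical tolerances: differences smaller than these are not logged.
TOLERANCES = {"steering_deg": 2.0, "throttle": 0.1, "brake": 0.1}


def divergence(primary, shadow):
    """Return the fields where the shadow output differs beyond tolerance."""
    diffs = {}
    for field, tol in TOLERANCES.items():
        a, b = getattr(primary, field), getattr(shadow, field)
        if abs(a - b) > tol:
            diffs[field] = (a, b)
    return diffs


def shadow_step(frame, primary_unit, shadow_unit, log):
    """Run both units on the same input; only the primary drives the car."""
    primary_cmd = primary_unit(frame)
    shadow_cmd = shadow_unit(frame)  # computed and compared, never applied
    diffs = divergence(primary_cmd, shadow_cmd)
    if diffs:
        log.append({"frame": frame, "diffs": diffs})
    return primary_cmd


# Toy stand-ins for the old (primary) and new (redundant) units.
def old_unit(frame):
    return Command(steering_deg=frame["curve"] * 10, throttle=0.3, brake=0.0)


def new_unit(frame):
    throttle = 0.3 if frame["curve"] < 1 else 0.5
    return Command(steering_deg=frame["curve"] * 10, throttle=throttle, brake=0.0)


log = []
applied = [shadow_step(f, old_unit, new_unit, log)
           for f in ({"curve": 0.5}, {"curve": 1.5})]
print(log)
```

Note that this only compares outputs on identical inputs. Since the new unit never actually steers, it cannot show what would have happened afterwards, which is the slippery-road kind of issue mentioned above.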

I am fascinated by the testing methodology you described (current version and proposed upgrade run in parallel in production). I hope this isn't a dumb question, but is there a name for that style of testing, and do there exist categories of systems for which it is commonly used?
> I hope this isn't a dumb question

Definitely not : )

> but is there a name for that style of testing and do their exist categories of systems for which it is commonly used

I don't know, but I guess parallel deployment, production testing (hehe) or something.

I guess I learned about it here on HN but I've heard similar approaches elsewhere.

Three concrete systems I can think of that have been tested in similar but not identical ways:

- trajectory calculation systems for some space probe (I forgot the details) where two separate vendors were tasked with writing software and their software was then run in parallel in a simulation of possible trajectories to root out any bugs. Mentioned as an example to point out an extreme variant of testing, probably by someone here on HN or in something linked to from HN.

- a vendor testing bag sorting equipment at an airport. Probably told to me by a colleague of mine at the time who knew them. AFAIK they'd rerun batches and verify the outputs, making sure the new system produced similar or better results.

- an API testing service, probably mentioned here on HN as well, that worked by firing three similar requests for each resource and method: two to a deployment of the old system and one to the new. The two that were fired against the old would be used to find parts of the response that changed from request to request (time stamps etc.), and the rest of the response would be used as a template to verify the response from the new system.

More generally: capturing and reusing production input, or running two sets of services in parallel in a data center or across data centers, with the existing system running as usual and the system under test receiving only the input data.
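The API testing trick (two requests to the old system, one to the new) could look roughly like the sketch below. It assumes responses are flat dictionaries and skips the actual HTTP calls. The function names are made up and this is not the real service's code.

```python
_MISSING = object()  # marks a key that is absent from a response


def volatile_keys(first, second):
    """Keys whose values differ between two responses from the old system."""
    keys = first.keys() | second.keys()
    return {k for k in keys
            if first.get(k, _MISSING) != second.get(k, _MISSING)}


def regressions(old_a, old_b, new):
    """Diff the new response against the old one, ignoring volatile keys."""
    ignore = volatile_keys(old_a, old_b)
    diffs = {}
    for k in (old_a.keys() | new.keys()) - ignore:
        expected = old_a.get(k, _MISSING)
        actual = new.get(k, _MISSING)
        if expected != actual:
            diffs[k] = (expected, actual)
    return diffs


# Two responses from the old deployment differ only in the timestamp.
old_a = {"id": 7, "name": "widget", "served_at": "12:00:01"}
old_b = {"id": 7, "name": "widget", "served_at": "12:00:02"}

new_ok = {"id": 7, "name": "widget", "served_at": "12:00:03"}
new_bad = {"id": 7, "name": "Widget", "served_at": "12:00:03"}

print(regressions(old_a, old_b, new_ok))
print(regressions(old_a, old_b, new_bad))
```

A real version would have to walk nested JSON and deal with fields that only change occasionally, which two samples from the old system would not reveal.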

Just make update transparency part of the initial certification.

Of course they might cheat, but who really knows if their car has an airbag where it is supposed to?

> Just make update transparency part of the initial certification.

Somehow I doubt the authorities have the foresight.

Even when they do, corporations will attempt to subvert process and policy "interlocks". See: the industry-wide emissions cheating scandal.
They do that sort of thing occasionally, but on the whole, over time, cars have improved with regard to emissions and safety, mostly due to regulation.
Perhaps, but it's going to be difficult to justify that logic to a jury in a wrongful death civil lawsuit.
I was in a car crash two years ago where a man went into a diabetic fit/seizure and sped through an intersection, ultimately hitting a building and my car and killing himself in the process. It is too bad his car did not have some of this deep learning that is not 100% accurate.

We don't know how humans work and under what conditions they break down, either.

I'm not in the market for a new car, but from what I read: there is something called "City Safety" by Volvo, and I know that Mercedes has similar tech (a friend learned that by not being run over by a distracted driver). So there are already technologies to prevent (or at least reduce the severity of) what happened to you (assuming he was below a certain speed threshold).

In contrast to the whole self-driving stuff that DL is popular for, here user input overrides DL input.

There is no evidence that deep learning would give better performance than other collision avoidance algorithms in such a scenario.
But that doesn't mean we shouldn't try. I was agreeing with GP (u/amelius) because I had the same idea when reading the post, but the parent of your comment (u/simonsarris) makes a good point: we might not know deep learning as well as we might like to know it, given that it is being used in applications that have the potential to kill, but we also don't know our own brains that well.

Even if we don't understand deep learning to the degree that we would like, we can observe its safety record and compare it to humans'.

There's a difference between active and passive use of self-driving with the current technology.

Passive self-driving systems that take over when the human gets distracted or unwell are great because human vision exceeds that of computers, whereas computers are always alert. This would capture the case you describe, and I think it would also be a massive improvement for when bus/lorry drivers collapse at the wheel (Elon Musk used this as a valid use case for Tesla auto-pilot at the Tesla Semi unveiling).

However, active self-driving systems (e.g. Tesla's auto-pilot) are currently worse because they rely on computer vision and on humans being always alert.

Every self driving car company is using neural networks. If neural nets are going to be used anyway, I'd rather have centimeter level range information per pixel than not.

By the way, although the article focuses on deep learning, there are many applications that don't involve deep learning. For example, although you can run the deep neural network based SuperPoint on the intensity data, you can also run any classical feature extraction algorithm such as SIFT, SURF, ORB, BRISK, FAST, AGAST, etc. Doing so provides an elegant solution to the problem of localizing in a geometrically sparse but visually rich environment, such as a smooth but well-illuminated tunnel.

The status quo is neural nets already are allowed in safety critical systems - humans.

And with the number of drivers snapchatting behind the wheel, I'd rather take my chances with a self-driving car.

> neural nets already are allowed in safety critical systems - humans.

I’d really like to see a demonstration that human behavior is just neural nets. Sadly, I think that’s still an open question.

Nothing you do will prevent a snapchatting driver from crashing into you. At that point I am sure you will observe with interest how a neural network deals with the situation.
You do realize that traditional computer vision doesn't work 100% of the time, right?

What would you rather use to interpret sensor data?

Somewhat related anecdote: in France, one of the options for toll roads is a device that handles the payment for you and opens the barrier as the car approaches (it works up to 30 km/h).

A friend of mine was doing that in a new car with some pedestrian detection system that decided to detect the barrier as a human and slammed on the brakes, bringing the car to a complete stop. From what I've heard it was not exactly pleasant.

So n=1 and it's not even the same technology.
Delaying the roll out of algorithms if they achieve superhuman performance in avoiding accidents might not be the moral high ground...