by swordswinger12 3251 days ago
This paper is a fairly convincing counter-argument to another recent work (https://arxiv.org/abs/1707.03501) on physical adversarial examples for autonomous vehicles. The other paper argued there was "no need to worry" about such physical attacks.
3 comments

In The Design of Future Things, ex-Apple UX designer Don Norman interviews (anonymous) engineers of a self-driving car:

DN: "It sped up when we left the freeway instead of slowing down on the exit, that was dangerous. Why did that happen?"

ENG: "Car saw it was on a straightaway. We'll add a rule to handle that."

DN: "So you have to add rules for every possible situation? Doesn't that mean that the car is always at risk for what it doesn't know about yet?"

ENG: "That's Not-A-Problem. We will classify everything."

That's an incredibly stupid answer. It is precisely this kind of thinking that makes me worry about sharing the road with alpha-grade self-driving hardware. There is real potential for carnage here: at highway speeds, it doesn't take that big of a software bug to get a lot of people killed.
It suffers from the same weakness as most of the other papers I've seen about physical attacks: they interpret a change in the most likely class as evidence that the attack will work against a real classifier. They say "We observe that all such baseline images lead to correct classification in all experiments." and then state that the average predicted target class probability was 0.8 ± 0.1. That suggests to me that they are taking a network which doesn't really know whether the sign is a stop sign but guesses that it is, and tricking it into not really knowing what the sign is but guessing that it is a speed limit sign. Presumably a real-world system would have much more confidence in the true class for the clean images.
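To make that distinction concrete, here is a minimal sketch of the difference between "the most likely class changed" and "the system would confidently act on the wrong class". Everything in it is hypothetical: the class names, the probabilities, and the 0.95 threshold are made up for illustration and do not come from either paper.

```python
# Hypothetical numbers for illustration only; not taken from either paper.
CONFIDENCE_THRESHOLD = 0.95  # assumed bar a deployed system might require


def top_class(probs):
    """Return (label, probability) for the most likely class."""
    label = max(probs, key=probs.get)
    return label, probs[label]


def decide(probs, threshold=CONFIDENCE_THRESHOLD):
    """Act on the top class only when it clears the threshold; otherwise abstain."""
    label, p = top_class(probs)
    return label if p >= threshold else "uncertain"


# A classifier that only weakly prefers "stop" on the clean sign...
clean = {"stop": 0.55, "speed_limit": 0.30, "yield": 0.15}
# ...and is pushed to a 0.8 preference for the attacker's target class.
attacked = {"stop": 0.15, "speed_limit": 0.80, "yield": 0.05}

print(top_class(clean)[0], "->", top_class(attacked)[0])  # the argmax flips
print(decide(clean), "->", decide(attacked))              # neither clears the bar
```

Under these assumed numbers the top class flips, but a system that requires high confidence would abstain in both cases, which is why a flipped argmax alone says little about a well-calibrated, high-confidence classifier.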
So, as a preliminary, you can conduct these kinds of attacks against humans as well. The change to the sign just often isn't as subtle.

Among other things, though, these types of attacks make a lot of assumptions. For example, they assume the only input to "what is that" is a classifier that looks at the image.

Given simple data and the previous classification of the image, for example, one can easily ask "hey, does it make sense for an added-lane sign to appear at a 4-way intersection with no apparent added lane?"

Heck, you don't even need to go that far. Given the previous, pre-vandalism classification of the sign, and no change in any terrain/mapping data, ...
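A rough sketch of that kind of cross-check, assuming the car has access to a mapped sign type and a label remembered from earlier drives. The `Context` type, the `cross_check` function, and the fallback policy are all hypothetical, not a description of how any real self-driving stack works.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Context:
    """Hypothetical side information available besides the camera image."""
    mapped_sign: Optional[str]     # sign type recorded in map data here, if any
    previous_label: Optional[str]  # label from earlier passes by the same sign


def cross_check(classifier_label: str, ctx: Context) -> Tuple[str, bool]:
    """Return (label to act on, whether the classifier conflicted with context)."""
    priors = [p for p in (ctx.mapped_sign, ctx.previous_label) if p is not None]
    if not priors or all(p == classifier_label for p in priors):
        # No context, or context agrees: trust the classifier.
        return classifier_label, False
    if len(set(priors)) == 1:
        # Context is self-consistent and disagrees: prefer it and flag the conflict.
        return priors[0], True
    # Context itself is inconsistent: abstain and flag.
    return "uncertain", True


# The vandalized stop sign now reads as "added_lane", but map and history say "stop".
print(cross_check("added_lane", Context(mapped_sign="stop", previous_label="stop")))
```

The point is only that a single image classification is one vote among several; a real system would presumably weigh these sources rather than apply a hard rule like this one.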

So yes, I'd pretty much say "there is no need to worry about such physical attacks", as long as you are not directly hooking up an image classifier to the steering wheel. The likelihood that this ends up a major problem for self-driving cars seems pretty low.

Optical illusions come to mind.