Hacker News
by sagebird 3264 days ago
It doesn't matter how many algorithms or sensors are consulted or combined to form a judgment. If an attacker can obtain a self-driving vehicle's hardware, and if enough tests can be performed per second, the attacker can train images that fool it.

Your idea is similar to an appeal to security through obscurity. Might work sometimes, but not generally.

(Noise does not help, because you can still discover a gradient to descend by averaging repeated trials.)
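To illustrate the averaging point, here is a minimal sketch, not anyone's actual attack code. It assumes a hypothetical black-box scorer `noisy_score` (a fixed linear score `W` plus fresh Gaussian noise on every query) and estimates its gradient with central finite differences, averaging each difference over many repeated trials. The names, the linear model, and the noise level are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the system under attack: a fixed linear score
# plus fresh random noise on every query. The attacker only sees the output.
W = np.array([0.5, -1.0, 2.0])

def noisy_score(x, noise_std=0.5):
    return float(W @ x) + rng.normal(0.0, noise_std)

def estimate_gradient(x, trials=2000, eps=0.5):
    """Central finite differences, each averaged over repeated noisy queries."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        diffs = [noisy_score(x + step) - noisy_score(x - step)
                 for _ in range(trials)]
        # Averaging shrinks the noise by roughly 1/sqrt(trials).
        grad[i] = np.mean(diffs) / (2 * eps)
    return grad

x0 = np.zeros(3)
grad_estimate = estimate_gradient(x0)
print("true gradient:     ", W)
print("averaged estimate: ", grad_estimate)
```

The cost of the noise is only more queries: the error of the averaged estimate falls off with the square root of the number of trials, which is why a high test rate matters.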

1 comments

You assume that all sensors are working on 'images' and that the algorithms are all using gradient descent.
Replace 'images' with 'sensor data' and adversarial examples can still be generated. They might not be as easy to feed into the vehicle's hardware (e.g. requiring speakers to fool an acoustic sensor), but the same principles apply.

It's also not necessary for the recognition algorithms to be using gradient descent. So long as they are differentiable (or can be approximated by a model that is), you can use gradient descent to find adversarial examples.
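A rough sketch of the "approximated by a model that is" case, under stated assumptions: `black_box` is a hypothetical hard-threshold rule with no useful gradient, and the attacker fits a logistic-regression surrogate to its answers, then follows the surrogate's gradient until the black box changes its decision. The rule, the sample counts, and the step sizes are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical black box: a hard threshold rule, so it has no useful gradient.
def black_box(X):
    return (X[:, 0] + 2.0 * X[:, 1] > 1.0).astype(float)

# 1. Query the black box to label attacker-chosen inputs.
X = rng.uniform(-3, 3, size=(2000, 2))
y = black_box(X)

# 2. Fit a differentiable surrogate (logistic regression) by gradient descent.
w, b = np.zeros(2), 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= 0.5 * X.T @ (p - y) / len(y)
    b -= 0.5 * np.mean(p - y)

# 3. Use the surrogate's gradient (w, for the logit) to move an input
#    across the black box's decision boundary.
x = np.array([[-1.0, -1.0]])  # the black box labels this input 0
x_adv = x.copy()
for _ in range(200):
    if black_box(x_adv)[0] == 1.0:
        break
    x_adv += 0.05 * w / np.linalg.norm(w)  # small step up the surrogate's logit

print("original label:   ", black_box(x)[0])
print("adversarial label:", black_box(x_adv)[0])
print("perturbation size:", np.linalg.norm(x_adv - x))
```

The surrogate never needs to match the target exactly; it only needs a gradient that points roughly toward the real decision boundary.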

Adversarial examples exist for any model with a high input dimension (in relation to the available training data); differentiability only helps with finding them.

That's really interesting, though. You might be able to combine something like Random Forest with Neural Nets to make them more robust to adversarial images.