by thisisdave 3527 days ago

Several of the universal perturbation vectors in Figure 4 remind me a lot of Deep Dream's textures.

I wonder what it is about these high-saturation, stripy-spiraly bits that these networks are responding to.

Is it something inherent in natural images? In the training algorithm? In our image compression algorithms? Presumably, the networks would work better if they weren't so hypersensitive to these patterns, so finding a way to dial that down seems like it could be pretty fruitful.

zo7 3527 days ago

My intuition is that these patterns "hijack" the ReLU activations in the lower layers, causing either important features not to fire or features that shouldn't fire to do so. Usually the lower layers learn very primitive shapes like lines and curves, and I think (although I'd need to double check) that they usually pass through entire color channels rather than nuanced mixings of colors. (So one feature would pass through all of red, all of blue, or all of both, rather than just 66% red, 47% blue, and 33% green; if it did the latter, it wouldn't generalize well.) This propagates the error through the network: later activations start firing in the wrong places, causing the misclassification.

(This is totally unsubstantiated though)
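To make the "hijacking" idea concrete, here is a toy sketch in plain Python. Everything in it is an assumption made up for illustration (the two hand-picked filters, the biases, the 4-pixel input, the perturbation); it is not taken from the paper or from any real network. It only shows that a small stripy offset can push a ReLU unit that was silent on the clean input across its threshold.

```python
def relu(x):
    return max(0.0, x)

def layer(weights, biases, x):
    """One fully connected layer followed by ReLU."""
    return [
        relu(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
        for w, b in zip(weights, biases)
    ]

# Hypothetical first-layer filters over a 4-pixel input.
weights = [
    [1.0, -1.0, 1.0, -1.0],  # "stripe" detector (alternating pattern)
    [1.0, 1.0, -1.0, -1.0],  # "edge" detector (bright left, dark right)
]
biases = [-0.5, -0.5]

x = [0.6, 0.5, 0.3, 0.2]        # clean input: mildly edge-like
delta = [0.2, -0.2, 0.2, -0.2]  # small stripy perturbation
x_adv = [a + d for a, d in zip(x, delta)]

clean = layer(weights, biases, x)
perturbed = layer(weights, biases, x_adv)

print("clean activations:    ", clean)
print("perturbed activations:", perturbed)
```

On the clean input only the edge unit fires; after adding the perturbation the stripe unit fires as well, even though no pixel moved by more than 0.2. In a deep network that spurious activation would feed into every later layer.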

dkarapetyan 3527 days ago

No intuition necessary:

> The surprising existence of universal perturbations reveals important geometric correlations among the high-dimensional decision boundary of classifiers. It further outlines potential security breaches with the existence of single directions in the input space that adversaries can possibly exploit to break a classifier on most natural images.

The paper unpacks that explanation pretty well, with actual pictures of the perturbations and how they relate to the classification boundary.
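For a feel of what a "single direction in the input space" means, here is a minimal sketch using a 2-D linear classifier. This is a deliberately simplified stand-in, not the paper's algorithm: the weights `w`, the grid of points, and the budget `eps` are all hypothetical. For a linear boundary, one fixed vector pointing against `w` moves every input the same distance toward the boundary, so it flips every point whose margin is smaller than `eps`.

```python
import math

# Hypothetical linear classifier: class 1 if w . x > 0, else class 0.
w = [3.0, -4.0]
w_norm = math.hypot(*w)

def score(x):
    return sum(w_i * x_i for w_i, x_i in zip(w, x))

def predict(x):
    return 1 if score(x) > 0 else 0

# Stand-in "dataset": grid points in [-1, 1]^2 that are classified as 1.
points = [
    (i / 5, j / 5)
    for i in range(-5, 6)
    for j in range(-5, 6)
    if predict((i / 5, j / 5)) == 1
]

# One fixed perturbation, shared by every input, with L2 norm eps.
eps = 0.5
v = [-eps * w_i / w_norm for w_i in w]

fooled = [p for p in points if predict((p[0] + v[0], p[1] + v[1])) != 1]
fooling_rate = len(fooled) / len(points)

print(f"perturbation v = {v}")
print(f"fooling rate   = {fooling_rate:.2f} over {len(points)} points")
```

Real classifiers have curved, high-dimensional boundaries, so the existence of one direction that works for most images is far less obvious there, which is what the quoted passage calls surprising.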