Hacker News
by derivt 3033 days ago
Given that DNNs are deep, a one-pixel attack means that a single pixel change propagates through the map of features: one pixel change -> a level-0 feature change -> a level-1 feature change, and so on. So this attack relies on weak features whose changes can easily propagate to the next level of features. Hence, to defend against this attack, the model should put a threshold on the ratio (sensitivity of features)/(number of pixels) and keep features with high sensitivity from easily propagating to the next level of the DNN. If the features are not linearly related to the input set, then correlation is not a measure of feature sensitivity and says nothing about the effect on the full DNN of such a change in a pixel.
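
To make the ratio idea concrete, here is a rough sketch in plain Python, not anything from the attack paper. It perturbs one pixel of a toy two-layer ReLU network and measures the largest feature change at each level, then compares (feature change)/(pixels changed) against a cap. The weights, the `threshold` value, the helper names, and the reading of "number of pixels" as the number of pixels changed are all hypothetical choices made for illustration.

```python
def relu(v):
    return [max(0.0, a) for a in v]

def dense(weights, x):
    # weights holds one row per output feature
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def forward(layers, x):
    """Return the activations of every level for input x."""
    activations = []
    h = x
    for weights in layers:
        h = relu(dense(weights, h))
        activations.append(h)
    return activations

def one_pixel_sensitivity(layers, x, pixel, delta):
    """Largest absolute feature change per level when one pixel moves by delta."""
    perturbed = list(x)
    perturbed[pixel] += delta
    base = forward(layers, x)
    moved = forward(layers, perturbed)
    return [
        max(abs(a - b) for a, b in zip(level_base, level_moved))
        for level_base, level_moved in zip(base, moved)
    ]

# Hypothetical toy weights: level-0 feature 0 leans heavily on pixel 2,
# and the level-1 feature leans heavily on that level-0 feature.
layers = [
    [[0.1, 0.1, 5.0, 0.1],
     [0.2, 0.2, 0.2, 0.2]],
    [[3.0, 0.5]],
]
x = [0.5, 0.5, 0.5, 0.5]

pixels_changed = 1
sens = one_pixel_sensitivity(layers, x, pixel=2, delta=0.1)

threshold = 0.3  # hypothetical cap on feature change per changed pixel
flagged = [level for level, s in enumerate(sens) if s / pixels_changed > threshold]

for level, s in enumerate(sens):
    print(f"level {level}: max feature change {s:.3f}")
print("levels over threshold:", flagged)
```

In this toy case the change grows from one level to the next, which is the kind of propagation the threshold is meant to catch. Note that this measures sensitivity by a direct finite perturbation rather than by correlation, so it does not assume the features are linear in the input.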