by derivt 3033 days ago
This kind of attack relies on a low-margin DNN. See (1): a low spectral norm of the input-output Jacobian matrix is what gives a good generalization error bound. So a one-pixel attack exploits a weak eigenvalue (small absolute value) of the Jacobian matrix.

So to create a one-pixel attack: 1) compute the eigenvalues of the input-output Jacobian matrix, 2) take the smallest eigenvalue lambda_1, 3) compute or approximate the function lambda_1 = f(input), 4) compute j = argmax_{i=1..n} d(lambda_1)/d(input_i) at the point where the spectral norm is maximum.

So to create the attack, change the j-th pixel in the points of the training set where the Jacobian has maximum (or high) spectral norm.
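
A rough sketch of the steps above, not a tested attack. Assumptions: the "network" is a toy two-layer tanh model with random weights (W1, W2, the sizes, and all function names are made up for illustration); since the input-output Jacobian is generally rectangular, "eigenvalue" is read here as "singular value"; and the derivative of lambda_1 with respect to each input is approximated by central finite differences.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 8, 16, 3          # toy sizes (hypothetical)
W1 = rng.normal(size=(n_hidden, n_in))    # random stand-in weights
W2 = rng.normal(size=(n_out, n_hidden))

def forward(x):
    """Toy network f(x) = W2 @ tanh(W1 @ x)."""
    return W2 @ np.tanh(W1 @ x)

def jacobian(x):
    """Analytic input-output Jacobian of the toy network, shape (n_out, n_in)."""
    h = np.tanh(W1 @ x)
    return W2 @ ((1.0 - h**2)[:, None] * W1)

def singular_values(x):
    # Steps 1-2: spectrum of the Jacobian (singular values, sorted descending).
    return np.linalg.svd(jacobian(x), compute_uv=False)

def smallest_sv(x):
    # Step 3: lambda_1 as a function of the input.
    return singular_values(x)[-1]

def spectral_norm(x):
    return singular_values(x)[0]

def grad_fd(func, x, eps=1e-5):
    """Central finite-difference gradient of a scalar function of x."""
    g = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        g[i] = (func(x + step) - func(x - step)) / (2 * eps)
    return g

def pick_pixel(points):
    """Step 4: argmax_i d(lambda_1)/d(input_i) at the max-spectral-norm point."""
    norms = np.array([spectral_norm(p) for p in points])
    x_star = points[int(np.argmax(norms))]
    grad = grad_fd(smallest_sv, x_star)
    return int(np.argmax(grad)), x_star, grad

points = rng.normal(size=(20, n_in))      # stand-in for training inputs
j, x_star, grad = pick_pixel(points)

# Perturb only the chosen coordinate; the step size 0.5 is arbitrary.
x_adv = x_star.copy()
x_adv[j] += 0.5
print("chosen pixel index:", j)
```

For a real DNN you would get the Jacobian from autodiff rather than a closed form, and whether the chosen pixel actually flips the prediction still has to be checked empirically.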

(1) https://arxiv.org/pdf/1605.08254.pdf