by amenhotep
921 days ago
This is a guess (I have read some of the paper, but to me it does not seem to explain this), but presumably you work backwards from the result you want. The classifier model outputs one vector that would represent "cat" and another that would represent "dog"; you take the difference between these two vectors and work back through the layers of matrix multiplication, finding places where small perturbations in the input data force the classifier to make exactly the errors you need.
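To make that guess concrete, here is a toy sketch of the "work backwards" idea. It is not the paper's method. The two-layer network, its random weights, and every name in it (`margin`, `margin_grad`, `perturb_toward`, `eps`, `step`) are made up for illustration. It backpropagates the "dog minus cat" output difference to the input by hand and nudges the input in that direction while keeping each coordinate within `eps` of the original.

```python
import numpy as np


def margin(x, W1, W2, src, tgt):
    """Target-class logit minus source-class logit for a toy 2-layer net."""
    h = np.tanh(W1 @ x)
    logits = W2 @ h
    return logits[tgt] - logits[src]


def margin_grad(x, W1, W2, src, tgt):
    """Gradient of the margin with respect to the input (manual backprop)."""
    h = np.tanh(W1 @ x)
    d_logits = W2[tgt] - W2[src]        # "dog minus cat" direction at the output
    d_pre = (1.0 - h ** 2) * d_logits   # back through the tanh layer
    return W1.T @ d_pre                 # back through the first matrix multiply


def perturb_toward(x0, W1, W2, src, tgt, eps=0.05, step=0.01, iters=20):
    """Signed-gradient ascent on the margin, clipped to an eps box around x0.

    Returns the best input seen, so the result is never worse than x0.
    """
    x = x0.copy()
    best_x, best_m = x0.copy(), margin(x0, W1, W2, src, tgt)
    for _ in range(iters):
        x = x + step * np.sign(margin_grad(x, W1, W2, src, tgt))
        x = np.clip(x, x0 - eps, x0 + eps)  # keep the change small
        m = margin(x, W1, W2, src, tgt)
        if m > best_m:
            best_x, best_m = x.copy(), m
    return best_x


# Hypothetical stand-in for a real classifier: random weights, 2 classes.
rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.3, size=(8, 16))
W2 = rng.normal(size=(2, 8))
x0 = rng.normal(size=16)

CAT, DOG = 0, 1
x_adv = perturb_toward(x0, W1, W2, src=CAT, tgt=DOG)
print("margin before:", margin(x0, W1, W2, CAT, DOG))
print("margin after: ", margin(x_adv, W1, W2, CAT, DOG))
print("max change per input coordinate:", np.max(np.abs(x_adv - x0)))
```

A real attack would presumably go through an autodiff framework and a real image model rather than hand-written gradients, but the shape of the loop would be similar: pick the output direction you want, push it back to the input, take a small bounded step, repeat.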
OTOH, given this group's track record with their earlier poisoning effort (Glaze) vs. the adulatory press that they managed to arrange, I don't expect much from Nightshade despite the similarly adulatory press. Great media relations game, though.