by lIIIllllIIII 1518 days ago
given that AI is primarily trained on web data, I wonder if it's possible to attack other people's ML training that way :-)
1 comment

that's the idea! we know about adversarial inputs at inference time, and this paper talks about adversarial perturbation of the model itself during training. what about undetectable adversarial training inputs, where people do their own training but the model still ends up with weaknesses that are hard to find (except for the adversary)?