
by boilerupnc 1907 days ago

[Disclosure: I'm an IBMer - not involved with this work]

With regard to exploitation, IBM Research has done some interesting work in the form of the open source "Adversarial Robustness Toolbox" [0]. From its description: "The open source Adversarial Robustness Toolbox provides tools that enable developers and researchers to evaluate and defend machine learning models and applications against the adversarial threats of evasion, poisoning, extraction, and inference."

It's fascinating to think through how one could engineer 2nd and 3rd order side-effects with targeted data poisoning to achieve a specific outcome. The poisoning could aim at a one-time gain (e.g. feeding data in a way that ultimately triggers an action that elicits some gain/harm) or at altering outcomes over a longer time horizon (e.g. teaching the bot to behave in a socially unacceptable way).
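
To make the one-time-gain case concrete, here is a toy sketch in plain Python. It does not use ART's API; the nearest-centroid "model", the data points, and the names (`fit_centroids`, `predict`, `target`) are all made up for illustration. The idea is that an attacker who can inject a handful of mislabeled samples near one chosen input may flip the prediction for that input while leaving ordinary inputs alone.

```python
from statistics import mean

def fit_centroids(samples):
    """Return {label: centroid} for 1-D (value, label) training pairs."""
    by_label = {}
    for value, label in samples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}

def predict(centroids, value):
    """Classify a value by its nearest class centroid."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))

# Hypothetical clean training set: class 0 clusters low, class 1 clusters high.
clean = [(1.0, 0), (2.0, 0), (3.0, 0), (7.0, 1), (8.0, 1), (9.0, 1)]

# The attacker wants this one input to be classified as 1 instead of 0.
target = 4.0

# Targeted poisoning: inject copies of the target value carrying the wrong label,
# which drags the class-1 centroid toward the target.
poison = [(target, 1)] * 6

clean_model = fit_centroids(clean)
poisoned_model = fit_centroids(clean + poison)

print("clean model, target:   ", predict(clean_model, target))
print("poisoned model, target:", predict(poisoned_model, target))
print("poisoned model, 1.0:   ", predict(poisoned_model, 1.0))
print("poisoned model, 9.0:   ", predict(poisoned_model, 9.0))
```

In this toy setup the prediction for the target changes while the clearly low and clearly high inputs keep their original labels, which is what would make this kind of poisoning hard to notice from aggregate accuracy alone. Real models and real attacks are far less tidy than this.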

[0] https://art360.mybluemix.net/