by hikingsimulator 812 days ago
There is a breadth of literature on the topic. I recommend the excellent survey by Baoyuan Wu (mathematical perspective) [1]. For IRL demonstrations, existing cases will of course be rarer, but they are not impossible, as with the attacks on Alpaca-7b [2].

[1] https://arxiv.org/abs/2302.09457 [2] https://poison-llm.github.io/


That paper says you need to control "0.1% of the training data size" for a 40% chance for one single injected prompt to fire. So that's millions of images or billions of text tokens for real-world models.
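
A rough back-of-the-envelope sketch of that arithmetic, in Python. The 0.1% figure is the one quoted above; the two dataset sizes are illustrative assumptions picked only to show the order of magnitude, not numbers taken from the paper, and `poisoned_samples_needed` is a hypothetical helper name.

```python
# Fraction of the training set an attacker must control,
# per the "0.1% of the training data size" figure quoted above.
POISON_FRACTION = 0.001

# Illustrative dataset sizes (assumptions for the sake of the estimate).
ASSUMED_DATASET_SIZES = {
    "image dataset (images)": 5_000_000_000,       # assumed: ~5 billion images
    "text corpus (tokens)": 1_000_000_000_000,     # assumed: ~1 trillion tokens
}


def poisoned_samples_needed(dataset_size: int, fraction: float = POISON_FRACTION) -> int:
    """Return how many samples (or tokens) the attacker would have to control."""
    return round(dataset_size * fraction)


for name, size in ASSUMED_DATASET_SIZES.items():
    print(f"{name}: {poisoned_samples_needed(size):,} to poison")
```

Under those assumed sizes, the attacker would need to control on the order of millions of images or a billion tokens, which matches the estimate above.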
Exactly. It is very difficult to implement these data poisoning attacks in the wild due to the size of internet data in general.
Yeah, but the vibes man.