|
|
|
|
|
by sureglymop
444 days ago
|
|
In this case the result/output is plain text. Since it's not code it may be harder to imagine an attack vector. As an attacker, here would be some of my capabilities/possibilities: - I could change the meaning of the output and the output entirely.
- If I can control one part of a larger set of data that is analyzed , I could influence the whole output.
- I could try to make the process take forever in order to waste resources. I'd say the first scenario is most interesting, especially if I could then potentially also influence how an LLM trained on the output behaves and do even more damage using this down the line. Let's say I'm a disgruntled website author. I want my users to see correct information on my website but don't want any LLM to be trained on it. In this case I could
probably successfully use prompt injection to "poison" the model. |
|