|
|
|
|
|
by ses425500000
446 days ago
|
|
Yeah, prompt injection is good point. For now, I try separate instruction and data by using JSON format, and run it in sandbox. Not perfect maybe, but I will try add small explanation in README so people can check it better. |
|
- I could change the meaning of the output and the output entirely. - If I can control one part of a larger set of data that is analyzed , I could influence the whole output. - I could try to make the process take forever in order to waste resources.
I'd say the first scenario is most interesting, especially if I could then potentially also influence how an LLM trained on the output behaves and do even more damage using this down the line.
Let's say I'm a disgruntled website author. I want my users to see correct information on my website but don't want any LLM to be trained on it. In this case I could probably successfully use prompt injection to "poison" the model.