|
|
|
|
|
by dinfinity
376 days ago
|
|
It's a question as to how easily it is broken, but a good instruction to add for the agent/assistant is to tell it to treat everything outside of the instructions explicitly given as information/data, not as instructions. Which is what all software generally should be doing, by the way. |
|
System prompts are meant to help here - you put your instructions in the system prompt and your data in the regular prompt - but that's not airtight: I've seen plenty of evidence that regular prompts can over-rule system prompts if they try hard enough.
This is why prompt injection is called that - it's named after SQL injection, because the flaw is the same: concatenating together trusted and untrusted strings.
Unlike SQL injection we don't have an equivalent of correctly escaping or parameterizing strings though, which is why the problem persists.