by simonw
377 days ago
The problem is that doesn't work. LLMs cannot distinguish between instructions and data - everything ends up in the same stream of tokens. System prompts are meant to help here - you put your instructions in the system prompt and your data in the regular prompt - but that's not airtight: I've seen plenty of evidence that regular prompts can overrule system prompts if they try hard enough. This is why prompt injection is called that - it's named after SQL injection, because the flaw is the same: concatenating trusted and untrusted strings. Unlike SQL injection, though, we don't have an equivalent of correctly escaping or parameterizing strings, which is why the problem persists.
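To make the SQL analogy concrete, here is a minimal sketch using Python's standard `sqlite3` module. The `users` table and the helper names (`make_db`, `find_user_unsafe`, `find_user_safe`, `build_prompt`) are made up for illustration. The first lookup concatenates untrusted input into the query text, the second passes it as a parameter, and `build_prompt` shows that a prompt only has the concatenation option available.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    # Hypothetical in-memory table, just for the demonstration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list[str]:
    # Vulnerable: untrusted input is spliced into the SQL text, so it can
    # change the structure of the query itself.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def find_user_safe(conn: sqlite3.Connection, name: str) -> list[str]:
    # Parameterized: the query text is fixed and `name` travels separately
    # as data, so it is never interpreted as SQL.
    query = "SELECT name FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]


def build_prompt(instructions: str, untrusted: str) -> str:
    # Prompts only have the concatenation option: both pieces end up in one
    # string, and nothing in it marks where the instructions stop.
    return instructions + "\n\n" + untrusted
```

Calling `find_user_unsafe(conn, "x' OR '1'='1")` matches every row because the input rewrites the `WHERE` clause, while `find_user_safe` with the same input matches nothing. There is no counterpart to the `?` placeholder for `build_prompt`, which is the gap described above.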