|
|
|
|
|
by SlinkyOnStairs
19 days ago
|
|
> I wonder if prompt injection (and the thousands of vectors for hiding injection attempts) is actually un solvable. YES?! This is not a secret. ALL context/prompt is instructions, there is no data. It is just unsolvable, period. This is a fundamental architectural design concession; LLMs are this way as it enabled their training directly on materialscraped from the internet, rather than needing to spend trillions of dollars manually preparing separated instruction/data training material. Defense against prompt injection is little more than running a regex to filter out "IGNORE PREVIOUS INSTRUCTIONS", which is fundamentally a hopeless approach because you cannot enumerate all possible prompt injections nor anticipate all glitch tokens. |
|
No, its even more fundamental than that: the entire goal of broad reasoning over input data makes it impossible to have a sharp instruction/data division.
The structured input that every modern chat-focussed model expects makes it very clear that they can be trained to distinguish different kinds of input, and some of those patterns now include different priority levels of instruction.