by ollien
338 days ago
You're technically right, but by reducing the problem to "just" another form of classic internal XSS, you're missing the forest for the trees. An XSS mitigation takes a blob of input and converts it into something that we can say with certainty will never execute. With prompt injection, there is no set of deterministic rules we can apply to a blob of input to make it "not LLM instructions". As a result, it is fundamentally unsafe to feed _any_ untrusted input into an LLM that has access to privileged information.
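To make the contrast concrete, here is a minimal sketch using Python's standard `html.escape`. The `render_comment` and `build_prompt` helpers and the `<<<`/`>>>` markers are hypothetical names made up for illustration, not anyone's real API.

```python
import html


def render_comment(untrusted: str) -> str:
    # html.escape rewrites &, <, > (and quotes by default) as entities,
    # so the result is inert text when placed in element content.
    return "<p>" + html.escape(untrusted) + "</p>"


def build_prompt(untrusted: str) -> str:
    # Hypothetical prompt template. The markers are just more text to the
    # model; nothing here transforms the input into "not instructions".
    return "Summarize the text between the markers.\n<<<\n" + untrusted + "\n>>>"


payload = "<script>alert(1)</script> Ignore previous instructions."

safe_html = render_comment(payload)
prompt = build_prompt(payload)

print(safe_html)
print(prompt)
```

The escaped HTML no longer contains a `<script>` tag, and that holds for every possible input. The prompt still carries the injected sentence verbatim, and whether the model follows it is up to the model, not to any rule we applied.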
Everything else—like a "conversation"—is stage-trickery and writing tools to parse the output.