by gregatragenet3
668 days ago
Great, I would love to get some of the prompts you have in mind and try them with my library and see the results. Do you have recommendations on more effective alternatives to prevent prompt attacks? I don't believe we should just throw up our hands and do nothing. No solution will be perfect, but we should strive to a solution that's better than doing nothing.
I wish I did! I’ve been trying to find good options for nearly two years now.
My current opinion is that prompt injections remain unsolved, and you should design software under the assumption that anyone who can inject more than a sentence or two of tokens into your prompt can gain total control of what comes back in the response.
So the best approach is to limit the blast radius if something goes wrong: https://simonwillison.net/2023/Dec/20/mitigate-prompt-inject...
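To make "limit the blast radius" concrete, here is a minimal sketch of one possible approach, not necessarily the one the linked post describes: once untrusted text has entered the context, the application refuses privileged tool calls no matter what the model asks for. The tool names, the `Context` type, and both functions are hypothetical, made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical tool names, for illustration only.
READ_ONLY_TOOLS = frozenset({"search_docs", "summarize"})
PRIVILEGED_TOOLS = frozenset({"send_email", "delete_file"})


@dataclass(frozen=True)
class Context:
    """Prompt text plus a flag recording whether any of it is untrusted."""
    text: str
    has_untrusted_input: bool


def allowed_tools(ctx: Context) -> frozenset:
    """Return the tools the model may call for this context."""
    # Assume the model is attacker-controlled as soon as untrusted
    # tokens are present, so only harmless tools stay available.
    if ctx.has_untrusted_input:
        return READ_ONLY_TOOLS
    return READ_ONLY_TOOLS | PRIVILEGED_TOOLS


def execute_tool_call(ctx: Context, tool_name: str) -> str:
    """Enforce the allowlist in application code, outside the model."""
    if tool_name not in allowed_tools(ctx):
        raise PermissionError(f"{tool_name!r} is not allowed for this context")
    return f"ran {tool_name}"  # placeholder for the real tool dispatch
```

The check lives in ordinary code rather than in the prompt, so a successful injection can change what the model requests but not what the application will actually run.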
“No solution will be perfect, but we should strive to a solution that's better than doing nothing.”
I disagree with that. We need a perfect solution because this is a security vulnerability, with adversarial attackers trying to exploit it.
If we patched SQL injection vulnerabilities with something that only worked 99% of the time, all of our systems would be hacked to pieces!
A solution that isn’t perfect will give people a false sense of security, and will result in them designing and deploying systems that are inherently insecure and cannot be fixed.
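For readers less familiar with the SQL injection comparison, here is a small sketch using Python's standard `sqlite3` module. The table, the data, and the function names are invented for illustration. The parameterized version is a structural fix: the input is passed as data and never parsed as SQL, which is the kind of complete fix the comment above says prompt injection still lacks.

```python
import sqlite3


def find_user_unsafe(conn, name):
    # Vulnerable: the input is spliced into the SQL text, so a quote
    # character in `name` can change the structure of the query.
    query = f"SELECT id FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterized: the driver binds `name` as a value, so it is
    # compared against the column and never interpreted as SQL.
    return conn.execute(
        "SELECT id FROM users WHERE name = ?", (name,)
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

payload = "x' OR '1'='1"
leaked = find_user_unsafe(conn, payload)  # condition is always true, matches every row
blocked = find_user_safe(conn, payload)   # no user has this literal name
```

A filter that tries to spot malicious input would have to recognize every possible payload, while the placeholder works without recognizing any of them.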