|
|
|
|
|
by lelanthran
2 hours ago
|
|
This conclusion: > I am less worried about prompt injection now. Before running this experiment, I expected prompt injection to be much easier than it turned out to be. Is unwarranted. Sure, the agent never output the secret, but did it output anything else? IOW, was it usable? An agent that considers every prompt an attack (and responds accordingly) "passes" this test, while being useless anyway. |
|
The final level was their product and it was impossible. But it was also impossible to get the LLm to do _anything_.
May as well just echo "prompt injection attempt detected" at that point and never send anything to an LLM.