Hacker News new | ask | show | jobs
by simonw 1161 days ago
That's exactly my problem.

Yes, it's better. Bet better isn't good enough.

When I'm building secure software, I want to know that a known exploit has been fully mitigated.

None of the software I ship is vulnerable to SQL injections, or XSS attacks, or CSRF - because I understand those vulnerabilities, and take reliable measures against them.

If someone finds an exploit, I can fix it.

With LLMs and prompt injection I don't get that confidence. If someone finds an exploit I can try and patch it with yet more pleading in my prompt, but I'm forever just guessing at what the fixes are. I can never be certain that a new exploit isn't one more layer of cunning natural-language prompting away.

That's a horrible way to build software.

1 comments

I agree, but then again I don’t think prompt injection attacks are as severe as SQLi or XSS attacks. The latter can be disastrous for your application if even one is found, while for prompt injection the worst can happen is that the user will spoil their own user experience when using an LLM-based product. Of course everything depends on the use case and thus in the current stage of LLMs I would not use them in any security-critical applications.
That depends on what additional capabilities and tools you've made available to your LLM.

If you've granted it access to private data or given it the ability to write and execute code - both things people are starting to actually do - it could be very serious indeed: https://simonwillison.net/2023/Apr/14/worst-that-can-happen/