by nullsense 1121 days ago
>Is there any way to protect against prompt injection here? Looking at the architecture I am thinking it would be possible for users to tell an agent what to do directly by tricking them.

Not a solved problem in humans either. I remember watching a documentary on North Korea recently that covered how Kim Jong-un's half-brother was murdered at an airport in Singapore by a woman who had been conned for months into believing she was starring in a reality TV show, playing pranks on people.

As generality approaches infinity, I'm not sure there's actually a way to solve this particular problem. It might just be a failure of imagination on my part.

2 comments

Got the place wrong: it was an airport in Malaysia, not Singapore.
Right you are!
Please tell me the name of this documentary. That sounds insane!