| HN Mirror

by nullsense 1168 days ago

>Is there any way to protect against prompt injection here? Looking at the architecture I am thinking it would be possible for users to tell an agent what to do directly by tricking them.

Not a solved problem in humans either. I recently watched a documentary on North Korea that covered how Kim Jong-un's half-brother was murdered at an airport in Singapore by a woman who had been conned for months into believing she was starring in a reality TV show, playing pranks on people.

As generality increases to infinity, I'm not sure there's actually a way to solve this particular problem. It might just be a failure of imagination on my part.

2 comments

oreally 1168 days ago

Got the place wrong: it was an airport in Malaysia, not Singapore.

nullsense 1168 days ago

Right you are!

newswasboring 1168 days ago

Please tell me the name of this documentary. That sounds insane!

bayesianbot 1168 days ago

Sounds like this: https://www.imdb.com/title/tt11394276/

nullsense 1168 days ago

It was this one:

https://youtu.be/9qRxNYuR2c4
