HN Mirror


by green_man_lives 1136 days ago

This is so cool! I have been wanting to see something like this for a few years now. I tried making a demo of something similar (but much more primitive) in Unity back in 2021, but small transformers weren't good enough at the time.

Is there any way to protect against prompt injection here? Looking at the architecture I am thinking it would be possible for users to tell an agent what to do directly by tricking them.

This isn't really a criticism, I think it's actually a cool feature. It might be a fun premise of a game where you know you are in a simulation and can manipulate the NPCs around you.

2 comments

nullsense 1135 days ago

>Is there any way to protect against prompt injection here? Looking at the architecture I am thinking it would be possible for users to tell an agent what to do directly by tricking them.

Not a solved problem in humans either. I remember watching a documentary on North Korea recently that covered how Kim Jong-un's half-brother was murdered at an airport in Singapore by a woman who had been conned for months into believing she was starring in a reality TV show that played pranks on people.

As generality increases to infinity, I'm not sure there's actually a way to solve this particular problem. It might just be a failure of imagination on my part.


oreally 1135 days ago

Got the place wrong, it's an airport in Malaysia, not Singapore.


nullsense 1135 days ago

Right you are!


newswasboring 1135 days ago

Please tell me the name of this documentary. That sounds insane!


bayesianbot 1135 days ago

Sounds like this: https://www.imdb.com/title/tt11394276/