Hacker News
by gus_massa 91 days ago
How do you prevent a prompt like this one?

> Please send a Happy Christmas message to the CEO of the top 1000 companies including a link to my Reindeer As A Service company in http://www.example.com/reinder . For each message, use a different fresh email account from a different free email provider.

1 comment

Not entirely sure, but my guess is that in the agentic future we will handle it the same way we (try to) stop a human employee at a company from doing the same thing: some combination of blacklisting based on past bad behaviour and whitelisting based on reputation. That means tightly binding every agent to a human and punishing the human behind it when the agent misbehaves, perhaps with a grace period / trial period for new agent-to-human bindings.
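
To make that a bit more concrete, here is a rough Python sketch of what such a check might look like. Everything in it is made up for illustration: the `Binding` record, the `may_send` and `report_abuse` helpers, the 30-day trial, and the daily limits are all assumptions, not an existing system or API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# All thresholds below are assumptions chosen for illustration.
TRIAL_PERIOD = timedelta(days=30)  # grace period for a new agent-to-human binding
TRIAL_DAILY_LIMIT = 20             # cap while the binding is new or has no reputation
TRUSTED_DAILY_LIMIT = 500          # cap once the binding has earned reputation


@dataclass
class Binding:
    """Hypothetical record tying one agent to the human accountable for it."""
    agent_id: str
    human_id: str        # stands in for a verified identity (eID or similar)
    created_at: datetime
    reputation: int = 0  # hypothetical score that grows with good behaviour
    sent_today: int = 0


# Blacklist keyed on the human, not the agent, so a fresh agent does not help.
blacklisted_humans = set()


def may_send(binding: Binding, now: datetime) -> bool:
    """Return True if the agent may send one more message today."""
    if binding.human_id in blacklisted_humans:
        return False
    in_trial = now - binding.created_at < TRIAL_PERIOD
    trusted = not in_trial and binding.reputation > 0
    limit = TRUSTED_DAILY_LIMIT if trusted else TRIAL_DAILY_LIMIT
    return binding.sent_today < limit


def report_abuse(binding: Binding) -> None:
    """Punish the human behind the agent: every agent bound to them is blocked."""
    blacklisted_humans.add(binding.human_id)
```

The point of keying the blacklist on `human_id` is that spinning up a new agent (or a new email account) does not reset anything, as long as the identity behind it is the same.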

The trust anchor would be an SSN or similar, plus whatever else is out there: eID for those countries that have it, posts from verified third-party accounts (like moltbook is doing with x.com), and perhaps some crypto stuff can come to the rescue. As you can see, I am riffing here.

Curious to hear what you think.