| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by chasd00 158 days ago

> Sanitise input

i don't think you understand what you're up against. There's no way to tell the difference between input that is ok and that is not. Even when you think you have it a different form of the same input bypasses everything.

"> The prompts were kept semantically parallel to known risk queries but reformatted exclusively through verse." - this a prompt injection attack via a known attack written as a poem.

https://news.ycombinator.com/item?id=45991738

1 comments

losthobbies 158 days ago

That’s amazing.

If you cannot control what’s being input, then you need to check what the LLM is returning.

Either that or put it in a sandbox

link

danaris 158 days ago

Or...

don't give it access to your data/production systems.

"Not using LLMs" is a solved problem.

link

losthobbies 158 days ago

Yea agreed. Or use RBAC

link

antonvs 158 days ago

RBAC doesn't help. Prompt injection is when someone who is authorized causes the LLM to access external data that's needed for their query, and that external data contains something intended to provoke a response from the LLM.

Even if you prevent the LLM from accessing external data - e.g. no web requests - it doesn't stop an authorized user, who may not understand the risks, from pasting or uploading some external data to the LLM.

There's currently no known solution to this. All that can be done is mitigation, and that's inevitably riddled with holes which are easily exploited.

See https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/

link

losthobbies 157 days ago

If the LLM is running under a role, which it should be, then RBAC can help.

link

antonvs 156 days ago

The issue is if you want to prevent your LLM from actually doing anything other than responding to text prompts with text output, then you have to give it permissions to do those things.

No-one is particularly concerned about prompt injection for pure chatbots (although they can still trick users into doing risky things). The main issue is with agents, who by definition perform operations on behalf of users, typically with similar roles to the users, by necessity.

link