| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by ngneer 374 days ago
	Don't eval untrusted input?

2 comments

brookst 374 days ago

LLMs eval everything. That’s how they work.

The best you can do is have system prompt instructions telling the LLM to ignore instructions in user content. And that’s not great.

link

pvillano 373 days ago

The minimum you can do is not allow the AI to perform actions on behalf of the user without informed consent.

That still doesn't prevent spam mail from convincing the LLM to suggest an attacker controlled library, GitHub action, password manager, payment processor, etc. No links required.

The best you could do is not allow the LLM to ingest untrusted input.

link

tough 373 days ago

> The best you could do is not allow the LLM to ingest untrusted input.

How would that even work in practice, when an LLM is mostly to be used by a user, which will provide by default, untrusted input?

link

ngneer 373 days ago

Thanks. I just find it funny that security lessons learned in past decades have been completely defenestrated.

link

fc417fc802 374 days ago

How do you suppose to build a tool-using LLM that doesn't do that?

link

Emiledel 374 days ago

https://github.com/its-emile/memory-safe-agent

link