| HN Mirror



	by marcfisc 392 days ago
	Sadly, these ideas have been explored before, e.g. https://simonwillison.net/2022/Sep/17/prompt-injection-more-... Also, OpenAI has proposed ways of training LLMs to trust tool outputs less than user instructions (https://arxiv.org/pdf/2404.13208). That approach doesn't work against these attacks either.