| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by simonw 979 days ago
	That's why I always emphasize that prompt injection isn't an attack against LLMs themselves: its a class of attacks against applications we build on top of LLMs that work by concatenating together trusted and untrusted prompts.

2 comments

thaanpaa 979 days ago

Isn't that just shifting the user's misunderstanding to whoever is developing the application?

I guess my argument is that if the type of behaviour described in the article causes problems, perhaps the technology was chosen incorrectly.

Edit: Or maybe I just have a problem with the vocabulary. Obviously, it's useful information.

link

roywiggins 979 days ago

It's a bit weird that they can't even avoid this when it comes to images; GPT shouldn't really be obeying instructions from images at all! I wonder if it's just OCRing images and concatenating that into the prompt...

link

simonw 979 days ago

It's much more sophisticated than just OCR. The model was trained on images and text at the same time - it isn't processing images in a separate step.

The GPT-4 paper has a bunch more about this.

link

thaanpaa 979 days ago

Not really, I suppose; it's just a different type of prompt. The algorithm does not "know" what it is fed. Data is data.

link