In other words, a probability-based text generator does not behave like a sentient being. That's hardly an attack; isn't it more of a misunderstanding of the technology?
That's why I always emphasize that prompt injection isn't an attack against LLMs themselves: its a class of attacks against applications we build on top of LLMs that work by concatenating together trusted and untrusted prompts.
It's a bit weird that they can't even avoid this when it comes to images; GPT shouldn't really be obeying instructions from images at all! I wonder if it's just OCRing images and concatenating that into the prompt...