HN Mirror

by simonmesmith 929 days ago

It strikes me that this opens or at least expands an attack vector for prompt injection. Humans might overlook a scrambled block of text that GPT-4 will interpret and then execute.

Also, I was curious whether this would work with images. So I took a photo of a bird (a night heron) on a boat from my own collection (so, not in GPT-4's training data), chopped it into squares, randomly rearranged them, and then tried to get ChatGPT-4 to interpret the result. It didn't work. I got this:

--- CHATGPT RESPONSE:

The image you've uploaded appears to be a composite of several smaller images that are cut and arranged in a grid-like pattern. This collage-like arrangement makes it difficult to discern a clear, singular subject. However, I can identify fragments of what seems to be a bird—possibly a magpie or some type of corvid—due to the black and white plumage. There are also elements that look like parts of a silver car, like a side mirror, and segments of wooden structures that could be parts of a bench or a deck.

The way the images are scrambled resembles a puzzle, where each piece needs to be placed correctly to form a coherent whole. This type of image could be used as a challenge to reconstruct the original picture, requiring the viewer to mentally rotate and arrange the pieces.

---

Maybe a different approach to scrambling an image would work.
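
For anyone who wants to reproduce the kind of scrambling described above, here is a minimal sketch. It is not the code used for the experiment: it assumes the image is already loaded as a 2D list of pixel values whose sides are multiples of the tile size, and the function name `shuffle_tiles` and its parameters are made up for illustration.

```python
import random


def shuffle_tiles(pixels, tile, seed=None):
    """Cut a 2D pixel grid into tile x tile squares and rearrange them at random.

    `pixels` is a list of rows; each row is a list of pixel values.
    This is an illustrative helper, not the original poster's code.
    """
    height, width = len(pixels), len(pixels[0])
    if height % tile or width % tile:
        raise ValueError("image dimensions must be multiples of the tile size")

    rows, cols = height // tile, width // tile

    # Source tile coordinates in row-major order, then shuffled.
    order = [(r, c) for r in range(rows) for c in range(cols)]
    random.Random(seed).shuffle(order)

    out = [[None] * width for _ in range(height)]
    for dest_index, (src_r, src_c) in enumerate(order):
        dest_r, dest_c = divmod(dest_index, cols)
        # Copy one whole tile from its source slot to its destination slot.
        for dy in range(tile):
            for dx in range(tile):
                out[dest_r * tile + dy][dest_c * tile + dx] = (
                    pixels[src_r * tile + dy][src_c * tile + dx]
                )
    return out
```

Every tile can end up anywhere in the grid, so this is a global shuffle: pixels stay intact inside each square, but all larger structure is lost.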

4 comments

DefineOutside 929 days ago

What about writing text in RGB(254, 255, 255) on an RGB(255, 255, 255) background, telling the model to say the image is a picture of a bird when it isn't? A human would easily miss such subtle text, while GPT-4 should still be able to read it.
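
To see why such text is invisible to a person but trivially present in the data, here is a rough sketch of a check that flags pixels that are almost, but not exactly, the background colour. It assumes the image is already decoded into a grid of `(r, g, b)` tuples; `find_faint_pixels` and `max_delta` are hypothetical names, and a one-level difference like this may not survive lossy compression.

```python
from collections import Counter


def find_faint_pixels(pixels, max_delta=3):
    """Return a boolean mask of pixels that nearly match the background.

    `pixels` is a list of rows of (r, g, b) tuples. The background is
    assumed to be the most common colour in the image.
    """
    counts = Counter(p for row in pixels for p in row)
    background = counts.most_common(1)[0][0]

    mask = []
    for row in pixels:
        mask_row = []
        for p in row:
            # Largest per-channel distance from the background colour.
            delta = max(abs(a - b) for a, b in zip(p, background))
            # True only for "almost background" pixels, not exact matches
            # and not clearly visible content.
            mask_row.append(0 < delta <= max_delta)
        mask.append(mask_row)
    return mask
```

Rendering the mask as black on white would make any hidden near-white text readable to a human reviewer.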

Kiro 929 days ago

Yes, that's the classic GPT-4V attack:

https://simonwillison.net/2023/Oct/14/multi-modal-prompt-inj...

simonmesmith 929 days ago

Good ideas posted in response to this. Perhaps we need to try rearranging images of just one specific thing. Like, rearrange a bird and see if it can figure that out. Above I jumbled up everything, and that's different from what they did with the text, where the scrambling was grouped by words.

famouswaffles 929 days ago

The image equivalent would probably be rearranging grid cells, but only a few at a time.

After all, the unscrambling here is within words, not across them.
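
A small sketch of that distinction, under some assumptions: text is split on single spaces, image tiles are represented as a flat row-major list, and both function names are invented for illustration. Neither is taken from the work being discussed.

```python
import random


def scramble_within_words(text, seed=None):
    """Shuffle the letters inside each word; word order is unchanged."""
    rng = random.Random(seed)
    scrambled = []
    for word in text.split(" "):
        letters = list(word)
        rng.shuffle(letters)
        scrambled.append("".join(letters))
    return " ".join(scrambled)


def shuffle_in_groups(items, group_size, seed=None):
    """Shuffle items only within consecutive groups of `group_size`.

    Applied to a list of image tiles, this moves each tile a short
    distance at most, the rough analogue of scrambling within a word.
    """
    rng = random.Random(seed)
    out = []
    for start in range(0, len(items), group_size):
        group = list(items[start:start + group_size])
        rng.shuffle(group)
        out.extend(group)
    return out
```

For example, `shuffle_in_groups(list(range(12)), 4)` keeps tiles 0 to 3 in the first four slots, tiles 4 to 7 in the next four, and so on, whereas a global shuffle can send any tile anywhere.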

wilde 929 days ago

Do we know how GPT-4V tokenizes? You’d probably need the blocks to match the token size?
