| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by greshake 1166 days ago

TLDR: With these vulnerabilities, we show the following is possible:

- Remote control of chat LLMs

- Persistent compromise across sessions

- Spread injections to other LLMs

- Compromising LLMs with tiny multi-stage payloads

- Leaking/exfiltrating user data

- Automated Social Engineering

- Targeting code completion engines

There is also a repo: https://github.com/greshake/llm-security and another site demonstrating the vulnerability against Bing as a real-world example: https://greshake.github.io/

These issues are not fixed or patched, and apply to most apps or integrations using LLMs. And there is currently no good way to protect against it.

1 comments

srslack 1166 days ago

The webpage context vuln demo against bing is hilarious. I had semantic web browser context via Chrome Debug Protocol and its Full Accessibilty Tree ready a month or two ago but decided not to put it in anything precisely because of prompt injection like this. I don't think these can be tamed in the way they need to be to be productized, especially not in the way big companies want. That's not to say they're useless, though.

You can also hook yourself up to the websocket and see that their solution to similar problems of prompt injection, bad speak, etc. is to revoke output of responses. It'll generate, but it has another model watching, and it'll take over once it detects "bad thing" (and end the conversation totally on the front-end. but it'll still keep generating, till about 20 messages in, and then the confabulation gets to be a bit much and/or the context just disappears and it just keeps responding as if it's the first message, with no context.)

greshake 1166 days ago

Check out my blog where I show even more up-to-date techniques and the insane ways vulnerable applications are being deployed: https://kai-greshake.de/

Here I go through all of the unsafe products (including military LLMs): https://kai-greshake.de/posts/in-escalating-order-of-stupidi...

Here you can add prompt injections to your resume for free to get your dream job: https://kai-greshake.de/posts/inject-my-pdf/