Hacker News
by pmontra 352 days ago
You write about mitigations, and I'm afraid you are correct. Can any method be more than just a mitigation? When we give somebody read access to something, we can expect that only loyalty (or fear, or... but let's stick with loyalty) keeps that person from leaking the information to other parties.

Improvements to prompting might increase the LLM equivalent of loyalty but people will always be creative at finding ways to circumvent limitations.

The only way not to lower security seems to be giving access to those LLMs only to people who already have read access to the whole database. If the LLM leaks all the data to them, they could have dumped it more easily with traditional tools. This might make an LLM almost useless, but if the LLM is effectively a tool with superuser access, so be it.

1 comment

Giving read access to only the people who should have read access doesn't solve the problem here.

The vulnerability arises when people who should have read access to the database delegate their permission to an LLM tool, which may get confused by malicious instructions it encounters and leak the data.

If the LLM tool doesn't have a way to leak that data, there's no problem.

But this is MCP, so the risk here is that the user will add other, separate MCP tools (like a fetch web content tool) that can act as an exfiltration vector.
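
To make that concrete, here's a rough Python sketch of the combination I'm describing. Everything in it is hypothetical: `Tool`, `exfiltration_risk`, and the three capability flags are names I made up for illustration, not part of MCP or any SDK, and the flags are labels you would have to assign by hand. The point is just that the risk is a property of the whole set of enabled tools, not of any single one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    # Hypothetical, hand-assigned capability labels for an enabled tool.
    name: str
    reads_private_data: bool = False      # can return data the user wants kept private
    sees_untrusted_content: bool = False  # can surface text written by an attacker
    can_send_data_out: bool = False       # can transmit data to an outside party


def exfiltration_risk(tools):
    """True when the enabled tools, taken together, can read private data,
    ingest attacker-written text, and send data somewhere external."""
    return (
        any(t.reads_private_data for t in tools)
        and any(t.sees_untrusted_content for t in tools)
        and any(t.can_send_data_out for t in tools)
    )


# A database tool whose rows may contain attacker-written text.
db_tool = Tool("database", reads_private_data=True, sees_untrusted_content=True)
# A separate fetch-web-content tool, which can make outbound requests.
fetch_tool = Tool("fetch", sees_untrusted_content=True, can_send_data_out=True)

print(exfiltration_risk([db_tool]))              # database tool alone
print(exfiltration_risk([db_tool, fetch_tool]))  # database tool plus fetch tool
```

With only the database tool enabled the check comes back false, because nothing can carry the data out. Adding the fetch tool flips it to true, even though neither tool is dangerous in isolation.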