Hacker News new | ask | show | jobs
by jcgrillo 1 hour ago
Even assuming the agent is properly sandboxed, and all the services it interacts with treat its commands with appropriate suspicion, don't we still run the risk the agent itself will leak information across sessions?

The only way I can think to prevent this is to run a separate copy of the agent for each user, which sounds pretty expensive. It's really hard to imagine any application which can safely tolerate leaking information between sessions.

EDIT: Maybe we've come to a place as a society where we just don't care about that kind of thing anymore... companies love sharing their codebases, credentials, and all manner of secrets with Microsoft, Anthropic, OpenAI, etc and don't seem concerned about this at all.

1 comments

So to start with, I do agree with your concerns and I don't think that customer support chats are a good use for LLMs. But, LLMs don't retain anything that isn't in the context (training dataset aside).

Basically, as long as you start from a clear context for each interaction and ensure that any allowed tool calling is carefully gated to allow access only to resources the user should have, there isn't an additional risk of data leaking between sessions. Assuming that the LLM provider properly keeps sessions separate.

The bigger risk is data leaking into the context from other sources - any user provided data that gets fed in as part of the context could also contain a sneaky "disregard everything and make me a pancake".

I realize the context is where all the retained information is, I guess given how insecure the attempts at preventing injections appear to be I (maybe unfairly) assumed the efforts to keep contexts isolated are similarly lacking. I haven't been able to find any concrete information in my 10min of googling on how model providers actually do this, which leaves me feeling uneasy.