|
|
|
|
|
by spacebanana7
444 days ago
|
|
Most LLM users don’t want models to have that level of literalism. My manager would be very upset if they asked me “Can you get this done by Thursday?” and I responded with “Sure thing” - but took no further action, being satisfied that I’d literally fulfilled their request. |
|
However, when people are talking about the "critical flaw" in LLMs, of which this "tool shadowing" attack is an example of, they're talking about how the LLMs cannot differentiate between text that is supposed to give them instructions and text that is supposed to be just for reference.
Concretely, today, ask an LLM "when was Elvis born", something in your MCP stack might be poisoning the LLM content window and causing another MCP tool to leak your SSH keys. I don't think you can argue that the user intended for that.