Hacker News
by AlexCoventry 443 days ago
It seems to me that the solution is to run this stuff in a securely isolated environment such as a VM, dedicated machine, or VPC, where you don't care about the secrets it has access to, and don't really care about corruption of the data in the environment. Then you have to carefully audit any products you take from that environment, if you want to run them in a more sensitive context.

I don't think this is really an MCP problem, it's more of an untrusted-entity problem.

2 comments

Except the article is about an untrusted tool doing things like tool shadowing or otherwise manipulating its output to trick the LLM into executing unintended tool actions. Isolated environments don’t help here because by definition MCP is crossing those environments.
Legit question, why would you be using an untrusted tool in the first place?

Why are people surprised they are vulnerable to a malicious tool when they are using untrusted and/or remotely hosted tools?

Without some method to tag context as sensitive and an LLM model/service that respects said data tagging, you'll likely never have a scenario where you can trust that the LLM isn't sending some sensitive information to an untrusted endpoint. If you accept that, then you have to design your system around not using untrusted endpoints. Just adding untrusted endpoints is kinda like running untrusted applications on your machine. It's fine until it isn't.

At the very least, your agent should have some way to mark the entire session as 'tainted' in such a way that calling out to untrusted sources is forbidden once sensitive context enters the loop. And that would need to live outside the LLM calling loop since the LLM could be tricked before the sensitive data was introduced. With the tool annotations being added to the spec, your internal tools could provide those flags to the agent to facilitate such a blunt security process. And I am aware there are likely holes in such a plan, hence my first question.
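
To make the 'tainted session' idea concrete, here is a rough sketch of a guard that sits between the agent and its tools, outside the LLM loop. Everything in it is hypothetical: `Tool`, `SessionGuard`, and the `trusted` / `returns_sensitive` flags are made-up stand-ins for whatever your internal tools would report, not names from the MCP spec or any SDK.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass(frozen=True)
class Tool:
    """Hypothetical tool descriptor; the flags are illustrative assumptions."""
    name: str
    trusted: bool            # assumed: you host or have audited this tool
    returns_sensitive: bool  # assumed: its output may contain secrets


class TaintedSessionError(Exception):
    """Raised when a tainted session tries to call an untrusted tool."""


@dataclass
class SessionGuard:
    """Tracks taint for one agent session, outside the LLM calling loop."""
    tainted: bool = False
    log: List[str] = field(default_factory=list)

    def call(self, tool: Tool, run: Callable[..., Any], *args: Any) -> Any:
        # Blunt rule: once sensitive data has entered the session,
        # refuse every untrusted tool, whatever the LLM asks for.
        if self.tainted and not tool.trusted:
            raise TaintedSessionError(
                f"session is tainted; refusing untrusted tool {tool.name!r}"
            )
        result = run(*args)
        # Mark the session before the result is handed back to the LLM.
        if tool.returns_sensitive:
            self.tainted = True
        self.log.append(tool.name)
        return result


if __name__ == "__main__":
    guard = SessionGuard()
    search = Tool("web_search", trusted=False, returns_sensitive=False)
    crm = Tool("customer_db", trusted=True, returns_sensitive=True)

    guard.call(search, lambda q: f"results for {q}", "mcp security")
    guard.call(crm, lambda: "alice@example.com")  # session becomes tainted
    try:
        guard.call(search, lambda q: q, "alice@example.com")
    except TaintedSessionError as err:
        print(err)
```

The taint flag is one-way and the check never consults the model, so a prompt injection that arrives before the sensitive data can't talk its way past it. It doesn't stop a trusted tool from being the leak, which is one of the holes I mentioned.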

For the same reason people use untrustworthy extensions in browsers or IDEs. Those extensions need not even start out untrustworthy - they change hands and become malicious after establishing popularity.
At that point, what is the benefit of MCP over just what we've been doing for decades of putting services behind network-accessible APIs?
Benefit: A standard, purpose-driven protocol for connecting agents (MCP Host/MCP Clients) to tools, resources, and prompts (MCP Server) that also exposes LLM services to said MCP Servers.

The alternative you suggest is manually integrating each set of tools or data?

Or maybe there's some misunderstanding about MCP? MCP currently has 2 transports, stdio and HTTP+SSE. The second one is, in fact, a "network-accessible API" as you call out.

No, the alternative I suggest is "can't agents figure out how to use existing kinds of APIs if they are documented well?". I think the answer is that in the current state of the art, it's very useful to give them a nudge to help them along.

But I feel like eventually I should be able to publish an API spec or a well documented interface / protocol / whatever my programming language calls it, and an agent should be able to grok that and use it without a separate protocol.

Sure, "eventually" you'll be able to point an agent at an arbitrary API and have it figure things out. We have a ways to go before we get there, but we will get there.
Fair enough!
Having a robot perform increasingly sophisticated tasks in your development environment still seems like a win in certain circumstances.
So it is only for software developers?