| > they only incur a cost to the client, and only impact the user context, if/when they're invoked. Look it up. Look up the cross-server injection examples. I guarantee you this is not true. An MCP server is at its heart some 'thing' that provides a set of 'tools' that an LLM can invoke. This is done by adding a 'tool definition'. A 'tool definition' is content that goes into the LLM prompt. That's how it works. How do you imagine an LLM can decide to use a tool? It's only possible if the tool definition is in the prompt. The API may hide this, but I guarantee you this is how it works. Putting an arbitrary amount of third-party content into your prompts has a direct, tangible impact on LLM performance (and cost). The more MCP servers you enable, the more you pollute your prompt with tool definitions, and, I assure you, the worse the results get. Just like pouring any large amount of unrelated crap into your system prompt does. At a small scale, it's OK; but as you scale up, the LLM performance goes down. Here's some background reading for you: https://github.com/invariantlabs-ai/mcp-injection-experiment... https://docs.anthropic.com/en/docs/build-with-claude/tool-us... |
Because yes, for the LLM to find the MCP servers it needs that info in its prompt. And the software is currently hiding how that information is being exposed. Is it prepended to your own message? Does it put it at the start of the entire context? If yes, wouldn’t real-time changes in tool availability invalidate the entire context? So then does it add it to the end of the context window instead?
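To make the placement question concrete, here's a toy sketch. It assumes the backend can only reuse earlier work for an unchanged prefix of the prompt, which is my assumption, not something any particular client documents. The tool names, the `render` format, and the chat history are all made up; real clients may serialize things very differently.

```python
import json

# Hypothetical tool definitions; names and fields are invented for illustration.
TOOLS_V1 = [{"name": "read_file", "description": "Read a file from disk.",
             "parameters": {"path": "string"}}]
TOOLS_V2 = TOOLS_V1 + [{"name": "search_web", "description": "Search the web.",
                        "parameters": {"query": "string"}}]

HISTORY = ["user: summarize notes.txt", "assistant: reading the file now"]


def render(tools, history, tools_first=True):
    """Flatten tool definitions and chat history into one prompt string."""
    tool_block = "TOOLS:\n" + "\n".join(json.dumps(t, sort_keys=True) for t in tools)
    chat_block = "\n".join(history)
    parts = [tool_block, chat_block] if tools_first else [chat_block, tool_block]
    return "\n".join(parts)


def shared_prefix_len(a, b):
    """Count the leading characters two prompts have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


if __name__ == "__main__":
    for tools_first in (True, False):
        before = render(TOOLS_V1, HISTORY, tools_first)
        after = render(TOOLS_V2, HISTORY, tools_first)
        kept = shared_prefix_len(before, after)
        print(f"tools_first={tools_first}: {kept} of {len(before)} chars unchanged")
```

In this toy format, adding a tool when the definitions sit at the start pushes the whole chat history out of the shared prefix, while appending them at the end leaves the history untouched. Whether any real tool does either is exactly the part we can't see.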
Like nobody really has this dialed in completely. Somebody needs to make an LLM “front end” that is the raw de-tokenized input and output. Don’t even attempt to structure it. Give me the input blob and output blob.
… I dunno. I wish these tools had ways to do more precise context editing. And more visibility. It would help make more informed choices on what to prompt the model with.
/Ramble mode off.
But slightly more seriously: what is the token cost for an MCP tool? Like, the LLM needs its name, a description, parameters… so maybe like 100 tokens max per tool? It’s not a lot, but it isn’t nothing either.
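One rough way to sanity-check that guess: serialize a tool definition and divide its length by about 4 characters per token. That ratio is a common rule of thumb for English text, not a real tokenizer, and the `get_weather` tool below is invented for illustration (it only mimics the name / description / JSON-schema shape). Real definitions with long descriptions or nested schemas can be much bigger.

```python
import json

# Hypothetical tool definition; name, description, and schema are made up.
tool = {
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name, e.g. Paris"},
            "units": {"type": "string", "enum": ["metric", "imperial"]},
        },
        "required": ["city"],
    },
}

CHARS_PER_TOKEN = 4  # rough heuristic only; actual tokenizers will differ


def estimate_tokens(definition):
    """Rough token estimate from the length of the compact JSON form."""
    text = json.dumps(definition, separators=(",", ":"))
    return max(1, len(text) // CHARS_PER_TOKEN)


def estimate_total(definitions):
    """Sum the rough estimates over every enabled tool."""
    return sum(estimate_tokens(d) for d in definitions)


if __name__ == "__main__":
    per_tool = estimate_tokens(tool)
    total = estimate_total([tool] * 40)
    print(f"~{per_tool} tokens per tool, ~{total} tokens for 40 similar tools")
```

The per-tool number matters less than the multiplication: a handful of servers with a dozen tools each gets paid for on every request, whether or not any tool is called.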