|
|
|
|
|
by kiitos
430 days ago
|
|
You're totally right, in that: whatever MCP servers your client is configured to know about, have a set of capabilities, each of which have some kind of definition, all of which need to be provided to the LLM, somehow, in order to be usable. And you're totally right that the LLM is usually general-purpose, so the MCP details aren't trained or baked-in, and need to be provided by the client. And those details probably gonna eat up some tokens for sure. But they don't necessarily need to be included with every request! Interactions with LLMs aren't stateless request/response, they're session-based. And you generally send over metadata like what we're discussing here, or user-defined preferences/memory, or etc., as part of session initialization. This stuff isn't really part of a "prompt" at least as that concept is commonly understood. |
|
There is the prompt that I, as a user, send to OpenAI which then gets used. There there is "prompt" which is being sent to the LLM. I don't know how these things are talked about internally at the company. But they take the "prompt" you send them and add a bunch of extra stuff to it. For example, they add in their own system message and they will add your system message. So you end up with something like <OpenAI system message> + <User system message> + <user prompt>. That creates a "final prompt" that gets sent to the LLM. I'm sure we both agree on that.
With MCP, we are also adding in <tool description> to that final prompt. Again, it seems we are agreed on that.
So the final piece of the argument is, as that "final prompt" (or whatever is the correct term) is growing. It is the size of the provider system prompt, plus the size of the user system prompt, plus the size of the tool description, plus the size of the actual user prompt. You have to pay that "final prompt" cost for each and every request you make.
If the size of the "final prompt" affects the performance of the LLM, such that very large "final prompt" sizes adversely affect performance, than it stands to reason that adding many tool definitions to a request will eventually degrade the LLM performance.