Hacker News
by Intermernet 102 days ago
I may be showing my ignorance here, but wouldn't the ideal situation be for the service to use the same number of tokens no matter what client sent the query?

If the service is using more tokens to produce the same output from the same query, but over a different protocol, then the service is a scam.

2 comments

If you intercept what an agent (the client) sends to the LLM when multiple MCP servers and tools are configured, you'll see that the context (or header) is filled with the available MCP servers and all of their tools, sent as part of the conversation.

With a CLI, you avoid sending this context to the LLM up front, and the agent progressively discovers only what it needs.

Input token costs come down because you're using a CLI instead of MCP.
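A rough sketch of the difference, not a measurement. The tool names, the payload shape, and the 4-characters-per-token estimate are all assumptions made up for illustration; real agents and tokenizers will differ.

```python
import json


def estimate_tokens(text: str) -> int:
    # Crude heuristic (about 4 characters per token); real tokenizers differ.
    return max(1, len(text) // 4)


# Hypothetical MCP setup: 3 servers with 10 tools each, all declared up front.
mcp_tools = [
    {
        "name": f"server{s}_tool{t}",
        "description": "Hypothetical tool description. " * 5,
        "input_schema": {
            "type": "object",
            "properties": {"arg": {"type": "string"}},
        },
    }
    for s in range(3)
    for t in range(10)
]

user_query = "List the open issues in my repo."

# MCP-style request: every tool definition rides along with the query.
mcp_payload = json.dumps({"tools": mcp_tools, "messages": [user_query]})

# CLI-style request: one generic shell tool; command details are discovered
# later (for example by running `--help`) only when they are needed.
cli_tools = [{"name": "shell", "description": "Run a shell command."}]
cli_payload = json.dumps({"tools": cli_tools, "messages": [user_query]})

mcp_tokens = estimate_tokens(mcp_payload)
cli_tokens = estimate_tokens(cli_payload)
print(f"MCP-style input estimate: {mcp_tokens} tokens")
print(f"CLI-style input estimate: {cli_tokens} tokens")
```

The user's query is identical in both payloads; only the tool definitions around it change the input size.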

When you're using an agent, the "query" isn't just each bit of text you enter into the agent prompt. It's the whole conversation.
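To make that concrete, here is a minimal sketch assuming each request resends a fixed prefix (tool definitions and the like) plus the full history so far. The function names and the 4-characters-per-token estimate are hypothetical, not taken from any real agent.

```python
def estimate_tokens(text: str) -> int:
    # Crude heuristic (about 4 characters per token); real tokenizers differ.
    return max(1, len(text) // 4)


def cumulative_input_tokens(fixed_prefix: str, turns: list[str]) -> int:
    """Total input tokens when every request resends prefix + history."""
    total = 0
    history = ""
    for turn in turns:
        history += turn
        # Each request carries the whole conversation, not just the new turn.
        total += estimate_tokens(fixed_prefix + history)
    return total


turns = [
    "user: hello " * 10,
    "assistant: hi " * 10,
    "user: next question " * 10,
]

# Stand-ins for a large up-front tool listing versus a single shell tool.
large_prefix = "tool definition " * 500
small_prefix = "shell tool " * 5

print(cumulative_input_tokens(large_prefix, turns))
print(cumulative_input_tokens(small_prefix, turns))
```

Because the prefix is paid for again on every request, a large up-front tool listing is multiplied by the number of turns.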

But I do wonder whether anyone has tested that the quality of subsequent responses is the same with these tools.

That doesn't explain why the protocol matters. Surely for equivalent responses, you need to send equivalent payloads. You shouldn't be able to hack this from the client side.