| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by whartung 318 days ago

I don’t know how this works, just to start off.

How does the AI bypass the MCP layer to make the request? The assumption is (as I understand it) the AI says “I want to make MCP request XYZ with data ABC” and it sends that off to the MCP interface which does the heavy lifting.

If the MCP interface is doing the schema checks, and tossing errors as appropriate, how is the AI routing around this interface to bypass the schema enforcement?

2 comments

whoknowsidont 318 days ago

>How does the AI bypass the MCP layer to make the request

It doesn't. I don't know why the other commenters are pretending this step does not happen.

There is a prompt that basically tells the LLM to use the generated manifest/configuration files. The LLM still has to not hallucinate in order to properly call the tools with JRPC and properly follow MCP protocol. It then also has to make sense of the structured prompts that define the tools in the MCP manifest/configuration file.

It's system prompts all the way down. Here's a good read of some the underlying/supporting concepts: https://huggingface.co/docs/hugs/en/guides/function-calling

Why this fact is seemingly being lost in this thread, I have no idea, but I don't have anything nice to say about it so I won't :). Other than we're all clearly quite screwed, of course.

MCP is to make things standard for humans, with expected formats. The LLM's really couldn't give a shit and don't have anything super special about how the interact with MCP configuration files or the protocol (other than some additional fine-tuning, again, to make it less likely to get the wrong output).

link

dragonwriter 318 days ago

> There is a prompt that basically tells the LLM to use the generated manifest/configuration files.

No, there isn't. The model doesn't see any difference between MCP-supplied tools, tools built in to the toolchain, and tools supplied by any other method. The prompt simply provides tool names, arguments, and response types to the model. The toolchain, a conventional deterministic program, reads the model response, finds things that meet the models defined format for tool calls, parses out the call names and arguments, looks up in its own internal list of tools to find matching names and see if they are internal, MCP supplied, or other tools, and routes the calls appropriately, gathers responses, does any validation it is designed to do, then mals the validated results into where the model's prompt template specifies tool results should go, and calls the model again with an new message appended to the previous conversation context containing the tool results.

link

12345hn6789 317 days ago

Do you have any technical diagrams or specs that describe this flow? I've been reading the Lang chain[0] and mcp docs[0] and cannot find this behavior you're proposing anywhere.

[0]- https://langchain-ai.github.io/langgraph/agents/mcp/

[1]- https://docs.anthropic.com/en/docs/mcp

link

whoknowsidont 316 days ago

Because it's about the MCP Host <-> LLM interaction. Not how a vanilla server and client communicate to each other and have done so for the last 5+ decades.

This really is not that hard to understand. The LLM must be "bootstrapped" with tool definitions and it must retain stable enough context to continue to call those tools into the future.

This will fail at some point, with any model. It will pretend to do a tool call, it will simply not do the tool call, or it will attempt to call a tool that does not exist, or any of the above or anything else not listed here. It is a statistical certainty.

I don't know why people are pretending MCP does something to fix this, or that MCP is special in anyway. It won't, and it's not.

Make sure you have a good understanding of the overall model: https://hackteam.io/blog/your-llm-does-not-care-about-mcp/

Then take a look at research like this: https://www.archgw.com/blogs/detecting-hallucinations-in-llm...

link

12345hn6789 316 days ago

Oh, so you're not talking about json validation inside the mcp server, you're talking about the contract between the LLM and the MCP server potentially changing. This is a valid issue the same as other APIs that must be written against, the same as you would with other external API connections. Mcp does not solve this correct, just the same as swagger does not solve it.

As for your comments on LLM pretending to do tool calls, sure. That's not what the original thread comments were discussing. There are ways to mitigate this with proper context and memory management but it is more advanced.

link

whoknowsidont 316 days ago

>That's not what the original thread comments were discussing. There are ways to mitigate this with proper context and memory management but it is more advanced.

That is what the original article is describing, and what the comments misunderstood or purposefully over-simplified, and extends it to being able to trace these issues across a large amount of calls/invocations at scale.

>MCP has none of this richness. No machine-readable contracts beyond basic JSON schemas means you can’t generate type-safe clients or prove to auditors that AI interactions follow specified contracts.

>MCP ignores this completely. Each language implements MCP independently, guaranteeing inconsistencies. Python’s JSON encoder handles Unicode differently than JavaScript’s JSON encoder. Float representation varies. Error propagation is ad hoc. When frontend JavaScript and backend Python interpret MCP messages differently, you get integration nightmares. Third-party tools using different MCP libraries exhibit subtle incompatibilities only under edge cases. Language-specific bugs require expertise in each implementation, rather than knowledge of the protocol.

>Tool invocations can’t be safely retried or load-balanced without understanding their side effects. You can’t horizontally scale MCP servers without complex session affinity. Every request hits the backend even for identical, repeated queries.

Somehow comments confused a server <-> client interaction which has been a non-issue for decades with making the rest of the "call stack" dependable. What leads to that level of confusion, I can only guess it's inexperience and religious zealotry.

It's also worth noting that certain commenters saying I "should" (I'm using this word on purpose) read the spec is also pretty laughable, considering how vague the "protocol" itself is.

>Clients SHOULD validate structured results against this schema.

Have fun with that one. MCP could have at least copied the XML/SOAP process around this and we'd be better off.

Which again, leads back to the articles ultimate premise. MCP does a lot of talking and not a lot of walking, it's pointless at best and is going to lead to A LOT of integration headaches.

link

wlamartin 315 days ago

I don't think people in this thread aren't really confused about MCP. They are confused that you claimed, or at least insinuated that an LLM might skip the schema validation portion of an MCP tool call request/response, which was originally demonstrated via Claude Code. Hopefully you can understand why everyone seems so confused, since that claim doesn't make any sense when the LLM doesn't really have anything to do with schema validation at all.

link

cle 318 days ago

What you described is essentially how it works. The LLM has no control over how the inputs & outputs are validated, nor in how the result is fed back into it.

The MCP interface (Claude Code in this case) is doing the schema checks. Claude Code will refuse to provide the result to the LLM if it does not pass the schema check, and the LLM has no control over that.

link

whoknowsidont 318 days ago

>The LLM has no control over how the inputs & outputs are validated

Which is completely fucking irrelevant to what everyone else is discussing.

link

dragonwriter 317 days ago

> > The LLM has no control over how the inputs & outputs are validated

> Which is completely fucking irrelevant to what everyone else is discussing.

Not sure what you think is going on, but that is literally the question this subthread is debating, starting with an exchange in which the salient claims were:

From: https://news.ycombinator.com/item?id=44849695

> Claude Code validated the response against the schema and did not pass the response to the LLM.

From: https://news.ycombinator.com/item?id=44850894

> This time.

> Can you guarantee it will validate it every time ?

link