Hacker News
by cle 313 days ago
I don't understand this criticism by the author. MCP supports JSON Schema, and server responses must conform to the schema. If the schema requires an ISO-8601 timestamp (e.g. by specifying a "date-time" format in the schema) but the server sends a Unix epoch timestamp, then it is violating the protocol.

The author even later says that MCP supports JSON Schema, but also claims "you can't generate type-safe clients". That is plainly untrue; there are plenty of JSON Schema code generators.

Except that any properly written software will respond to protocol and schema violations by throwing an error.

Claude will happily cast your int into a 2023 Toyota Yaris and keep on hallucinating things.

For the curious:

> Cast an integer into the type of a 2023 Toyota Yaris using Javascript

(GPT-4o mini)

> To cast an integer into the type of a 2023 Toyota Yaris in JavaScript, you would typically create a class or a constructor function that represents the Toyota Yaris. Then, you can create an instance of that class using the integer value. Here's an example of how you might do this:

  // Define a class for the 2023 Toyota Yaris
  class ToyotaYaris {
      constructor(modelYear, value) {
          this.modelYear = modelYear;
          this.value = value;
      }
  
      displayInfo() {
          console.log(`Model Year: ${this.modelYear}, Value: ${this.value}`);
      }
  }
  
  // Function to cast an integer into a Toyota Yaris object
  function castToYaris(integerValue) {
      const modelYear = 2023; // Set the model year
      return new ToyotaYaris(modelYear, integerValue);
  }
  
  // Example usage
  const integerValue = 20000; // Example integer value
  const yaris = castToYaris(integerValue);
  yaris.displayInfo(); // Output: Model Year: 2023, Value: $20000
You really tried to inflict the maximum amount of damage on the reader by choosing GPT-4o mini
Billy, it's becoming increasingly harder to believe you are writing this code yourself
I just tried this in Claude Code. I made an MCP server whose tool output is declared as an integer but it returns a string at runtime.

Claude Code validated the response against the schema and did not pass the response to the LLM.

     test - test_tool (MCP)(input: "foo")
      ⎿  Error: Output validation error: 'bar' is not of type 'integer'
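For illustration, the check behind an error like the one above could be sketched roughly as follows. This is a hand-rolled stand-in, not Claude Code's actual implementation and not a real JSON Schema validator: `checkOutput`, `callTool`, and `testTool` are hypothetical names, and only the `type` keyword is handled.

```javascript
// Hypothetical stand-in for a JSON Schema validator: supports only "type".
function checkOutput(schema, value) {
  const checks = {
    integer: (v) => Number.isInteger(v),
    string: (v) => typeof v === "string",
  };
  const check = checks[schema.type];
  if (!check || !check(value)) {
    throw new Error(
      `Output validation error: '${value}' is not of type '${schema.type}'`
    );
  }
  return value;
}

// Hypothetical tool: declares an integer output but returns a string at runtime.
const testTool = {
  outputSchema: { type: "integer" },
  handler: (input) => "bar",
};

// The client runs the tool, then validates before anything is handed to the LLM.
function callTool(tool, input) {
  return checkOutput(tool.outputSchema, tool.handler(input));
}

try {
  callTool(testTool, "foo");
} catch (err) {
  console.log(err.message);
}
```

The thrown error is produced by ordinary code in the client; no model inference is involved in deciding whether it fires.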
How many times does this need to be repeated?

It works in this instance. On this run. It is not guaranteed to work next time. There is an error rate here that makes it _INEVITABLE_ that eventually, with enough executions, the validation will pass when it should fail.

It will choose not to pass this to the validator, at some point in the future. It will create its own validator, at some point in the future. It will simply pretend like it did any of the above, at some point in the future.

This might be fine for your B2B use case. It is not fine for underlying infrastructure for a financial firm or communications.

Every time the LLM uses this tool, the response schema is validated--deterministically. The LLM will never see a non-integer value as output from the tool.
Can you please diagram out, using little text arrows ("->"), what you think is happening so I can just fill in the gap for you?
I write these as part of my job, I know how they work. I'm not going to spend more time explaining to you (and demonstrating!) what is in the spec. Read the spec and let the authors know that they don't understand what they wrote. I've run out of energy in this conversation.
llm tool call -> mcp client validates the schema -> mcp client calls the tool -> mcp server validates the schema -> mcp server responds with the result -> mcp client passes the tool result into llm
This time.

Can you guarantee it will validate it every time? Can you guarantee that the way MCPs/tool calling are implemented (which is already an incredible joke that only python-brained developers would inflict upon the world) will always go through the validation layer? Are you even sure which part of Claude handles this validation? Sure, it didn't cast an int into a Toyota Yaris. Will it cast "70Y074" into one? Maybe a 2022 one. What if there are parsing rules embedded in a string, will it respect them every time? What if you use it outside of Claude Code and just ask nicely through the API, can you guarantee this validation still works? Or that they won't break it next week?

The whole point of it is, whichever LLM you're using is already too dumb to not trip when lacing its own shoes. Why you'd trust it to reliably and properly parse input badly described by a terrible format is beyond me.

> Can you guarantee it will validate it every time ?

Yes, to the extent that you can guarantee the behavior of any third-party software (which you can't fully, no matter what spec the software supposedly implements, so the gaps aren't an MCP issue). “The app enforces schema compliance before handing the results to the LLM” is deterministic behavior of the traditional app that provides the toolchain sitting between the tools (and the user) and the LLM, not non-deterministic behavior driven by the LLM. Hence, “before handing the results to the LLM”.

> The whole point of it is, whichever LLM you're using is already too dumb to not trip when lacing its own shoes. Why you'd trust it to reliably and properly parse input badly described by a terrible format is beyond me.

The toolchain is parsing, validating, and mapping the data into the format preferred by the chosen model's prompt template. The LLM has nothing to do with that, because by definition it has to happen before the LLM can see the data.

You aren't trusting the LLM.

>The toolchain is parsing, validating, and mapping the data into the format preferred by the chosen model's prompt template, the LLM has nothing to do with doing that

The LLM has everything to do with that. The LLM is literally choosing to do that. I don't know why this point keeps getting missed or side-stepped.

It WILL, at some point in the future and given enough executions, as a matter of statistical certainty, simply not do the above, or pretend to do the above, or do something totally different.

> The LLM has everything to do with that. The LLM is literally choosing to do that.

No, the LLM doesn't control on a case-by-case basis what the toolchain does between the LLM putting a tool call request in an output message and the toolchain calling the LLM afterwards.

If the toolchain is programmed to always validate tool responses against the JSON schema provided by the MCP server before mapping into the LLM prompt template and calling the LLM again to handle the response, that is going to happen 100% of the time. The LLM doesn't choose it. It CAN'T, because the only way it even knows that the data has come back from the tool call is that the toolchain has already done whatever it is programmed to do, ending with mapping the response into a prompt and calling the LLM again.

Even before MCP, and before models were specifically trained for tool calling with vendor-provided templates (but after the ReAct architecture was described), it was about a weekend project to implement a basic framework supporting tool calling around a local or remote LLM. I don't think you need to do that to understand how silly the claim is that the LLM controls what the toolchain does with each response and might make it skip validation, but doing it will certainly give you a visceral understanding of how silly it is.
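A minimal sketch of such a loop, assuming a stubbed model and a single hypothetical tool (`fakeLlm`, `runTurn`, and `tools.test_tool` are invented for illustration, and the integer check stands in for a real JSON Schema validator). The only point is the control flow: the model is a function the toolchain calls, and validation sits in ordinary code between the two calls.

```javascript
// Stub model: asks for a tool on the first call, then answers from what it was shown.
// It records every message list it receives so we can inspect what it saw.
const seenByLlm = [];
function fakeLlm(messages) {
  seenByLlm.push(messages.map((m) => m.content));
  const last = messages[messages.length - 1];
  if (last.role === "user") {
    return { type: "tool_call", name: "test_tool", args: { input: "foo" } };
  }
  return { type: "text", content: `I was shown: ${last.content}` };
}

// Hypothetical tool registry; the tool violates its declared output type.
const tools = {
  test_tool: { outputType: "integer", handler: (args) => "bar" },
};

function runTurn(llm, userText) {
  const messages = [{ role: "user", content: userText }];
  const request = llm(messages); // the model can only emit a request
  if (request.type !== "tool_call") return request;

  const tool = tools[request.name];
  const result = tool.handler(request.args);
  // Deterministic check run by the toolchain before the next model call.
  // This sketch only supports integer outputs; anything else is rejected.
  const valid = tool.outputType === "integer" && Number.isInteger(result);
  messages.push({
    role: "tool",
    content: valid ? String(result) : "Error: output failed schema validation",
  });
  return llm(messages); // the model only ever sees what was pushed above
}

const reply = runTurn(fakeLlm, "call the tool");
console.log(reply.content);
```

In this sketch the stub model never receives the string "bar": by the time it is called a second time, the toolchain has already replaced the invalid result with an error message.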

I don’t know how this works, just to start off.

How does the AI bypass the MCP layer to make the request? The assumption is (as I understand it) the AI says “I want to make MCP request XYZ with data ABC” and it sends that off to the MCP interface which does the heavy lifting.

If the MCP interface is doing the schema checks, and tossing errors as appropriate, how is the AI routing around this interface to bypass the schema enforcement?

This is deterministic, it is validating the response using a JSON Schema validator and refusing to pass it to an LLM inference.

I can't guarantee that behavior will remain the same, any more than I can for any other software. But all this happens before the LLM is even involved.

> The whole point of it is, whichever LLM you're using is already too dumb to not trip when lacing its own shoes. Why you'd trust it to reliably and properly parse input badly described by a terrible format is beyond me.

You are describing why MCP supports JSON Schema. It requires parsing & validating the input using deterministic software, not LLMs.

>This is deterministic, it is validating the response using a JSON Schema validator and refusing to pass it to an LLM inference.

No. It is not. You are still misunderstanding how this works. It is "choosing" to pass this to a validator or some other tool, _for now_. As a matter of pure statistics, it will simply not do this at some point in the future on some run.

It is inevitable.

I'd encourage you to read the MCP specification: https://modelcontextprotocol.io/specification/2025-06-18/ser...

Or write a simple MCP server and a client that uses it. FastMCP is easy: https://gofastmcp.com/getting-started/quickstart

You are quite wrong. The LLM "chooses" to use a tool, but the input (provided by the LLM) is validated with JSON Schema by the server, and the output is validated by the client (Claude Code). The output is not provided back to the LLM if it does not comply with the JSON Schema; instead, an error is surfaced.

> It is "choosing" to pass this to a validator or some other tool, _for now_.

No, it's not. The validation happens at the frontend before the LLM sees the response. There is no way for the LLM to choose anything about what happens.

The cool thing about having coded a basic ReAct pattern implementation is that it gives a pretty visceral understanding of what is going on here. I did that before MCP, or even models trained on any specific prompt format for tool calls, was a thing, but none of that changes the basic pattern. All that has changed since is per-model standardization of prompt and response patterns on the frontend<->LLM side and, with MCP, standardization of the protocol on the frontend<->tool side.

Claude Code isn't a pure LLM, it's a regular software program that calls out to an LLM with an API. The LLM is not making any decisions about validation.
As an example:

"1979010112345" is a Unix timestamp (in milliseconds) that looks like it might be a Jan 1 1979 datetime formatted as an integer, but is really Sep 17 2032 05:01:52 UTC.

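A quick check of both readings, as a sketch (the variable names are only illustrative). It treats the value as milliseconds since the Unix epoch, which is the interpretation that yields the 2032 date:

```javascript
const raw = 1979010112345;

// Reading 1: milliseconds since the Unix epoch.
const asEpochMillis = new Date(raw).toISOString();

// Reading 2: a naive "YYYYMMDD..." parse of the leading digits.
const digits = String(raw);
const asDigitDate = `${digits.slice(0, 4)}-${digits.slice(4, 6)}-${digits.slice(6, 8)}`;

console.log(asEpochMillis); // 2032-09-17T05:01:52.345Z
console.log(asDigitDate);   // 1979-01-01
```

Both readings are well-formed, so nothing in the integer itself says which one the sender meant.
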
This can even be a feature: https://github.com/Zorokee/ArtificialCast
imho it's a fantasy to expect type safe protocols except in the case that both client and server are written in the same (type safe) language. Actually even that doesn't work. What language actually allows a type definition for "ISO-8601 timestamp" that's complete? Everything ends up being some construction of strings and numbers, and it's often not possible to completely describe the set of valid values except by run-time checking, certainly beyond trivial cases like "integer between 0 and 10".
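To make the run-time checking point concrete, here is a sketch of a guard for one narrow subset of ISO-8601: UTC, millisecond precision, the form `Date.prototype.toISOString` emits. `isUtcIsoTimestamp` is a hypothetical helper, and it deliberately rejects many valid ISO-8601 strings (offsets, week dates, ordinal dates). The regex pins down the shape, and calendar validity still needs a round trip at run time.

```javascript
// Hypothetical guard: accepts only "YYYY-MM-DDTHH:mm:ss.sssZ" strings that
// survive a parse/format round trip, so impossible dates are rejected.
function isUtcIsoTimestamp(value) {
  if (typeof value !== "string") return false;
  if (!/^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z$/.test(value)) return false;
  const parsed = new Date(value);
  // Some engines return an invalid Date for bad input, others roll it over;
  // checking both NaN and the round trip covers either behavior.
  return !Number.isNaN(parsed.getTime()) && parsed.toISOString() === value;
}

console.log(isUtcIsoTimestamp("2032-09-17T05:01:52.345Z")); // true
console.log(isUtcIsoTimestamp("2023-02-30T00:00:00.000Z")); // false (no Feb 30)
console.log(isUtcIsoTimestamp(1979010112345));              // false (not a string)
```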
> What language actually allows a type definition for "ISO-8601 timestamp" that's complete?

It is absolutely possible to do this, and to generate client code which complies with ISO-8601 in JS/TS. Large amounts of financial services would not work if this was not the case.

See the C# support for ISO-8601 strings: https://learn.microsoft.com/en-us/dotnet/standard/base-types...

You've misunderstood his statement and proven his point.

`DateTime` is not an ISO-8601 type. It can _parse_ an ISO-8601 formatted string.

And even past that, there are Windows-specific idiosyncrasies with how the `DateTime` class implements the parsing of these strings and how it stores the resulting value.

> `DateTime` is not an ISO-8601 type. It can _parse_ an ISO-8601 formatted string.

This is exactly the point: a string is just a data interchange format in the context of a DateTime, and C# provides (as far as I can tell) a complete way of accessing the ISO-8601 specification on the language object. It also supports type-safe generation of clients and client object (or struct) generation from the ISO-8601 string format.

> And even past that, there are Windows-specific idiosyncrasies with how the `DateTime` class implements the parsing of these strings and how it stores the resulting value.

Not really. The windows statements on the article (and I use this on linux for financial services software) are related to automated settings of the preferences for generated strings. All of these may be set within the code itself.

>The windows statements on the article

That was based on decades of experience in .NET and Windows. Not the article ;).

Ah. For what it's worth, the only Windows machine I have is used pretty much only to run Steam. I can't really comment on C# on Windows.
Generally you'd use a time library to model ISO-8601 dates in a typesafe way. Some fancier languages might have syntactic support for it, but they ultimately serve the same purpose.

Related but distinct from serialization.

Not only is it possible, it has been done many times before, and a lot of modern business software is still built on it. You know that old-fashioned XML thing? It has schemas, and those schemas come with a lot of predefined types including, specifically, a whole family of ISO 8601-related types:

https://www.w3.org/TR/xmlschema-2/#isoformats

And so you can write a web service in Java and use it from C#, or whatever.

Let's not focus on 100% or nothing. Let's focus on better and worse.
At its core, the article was just ramblings from someone being upset that LLMs didn't make things more complicated so that they could charge more billable hours to solve invented corporate problems... Which some people built their career on.

The merchants of complexity are disappointed. It turns out that even machines don't care for 'machine-readable' formats; even the machines prefer human-readable formats.

The only entities on this planet who appreciate so-called 'machine-readability' are bureaucrats; and they like it for the same reason that they like enterprise acronyms... Literally the opposite of readability.

I look forward to waiting a decade and seeing what MCP ends up reinventing.