Hacker News
by taocoyote 406 days ago
I don't understand the logistics of MCP interactions. Can anyone explain why they aren't stateless? Why does a connection need to be held open?
2 comments

I think some of the advanced features around sampling from the calling LLM could theoretically benefit from a bidirectional stream.

In practice, nobody uses those parts of the protocol (it was overdesigned and hardly any clients support it). The key thing MCP brings right now is a standardized way to discover & invoke tools. This would’ve worked equally well as a plain HTTP-based protocol (certainly for a v1) and it’d have made it 10x easier to implement.
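To make the "plain HTTP would have been enough" argument concrete, here is a minimal sketch of a stateless discover-and-invoke handler. It assumes a simplified JSON request shape; the `tools/list` and `tools/call` method names are modeled on MCP's tool methods, but the envelope, the `TOOLS` registry, and `handle_request` are hypothetical and not the real wire format. Each call takes a request body and returns a response body, so it could sit behind any ordinary HTTP POST endpoint.

```python
import json

# Hypothetical tool registry: name -> description and implementation.
TOOLS = {
    "add": {"description": "Add two numbers", "fn": lambda a, b: a + b},
}


def handle_request(body: str) -> str:
    """Handle one request with no state carried over between calls."""
    req = json.loads(body)
    method = req.get("method")
    if method == "tools/list":
        # Discovery: describe every tool the server offers.
        result = [
            {"name": name, "description": tool["description"]}
            for name, tool in TOOLS.items()
        ]
    elif method == "tools/call":
        # Invocation: look the tool up by name and pass the arguments through.
        params = req["params"]
        result = TOOLS[params["name"]]["fn"](**params["arguments"])
    else:
        return json.dumps({"id": req.get("id"), "error": "unknown method"})
    return json.dumps({"id": req.get("id"), "result": result})
```

Nothing here needs a held-open connection, which is the point being made: discovery and invocation are plain request/response.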

Sampling is, to my eyes, a very promising aspect of the protocol. Maybe its implementation is lagging behind because it's too far from the previous mental model of tool use. I am also fine with the burden being on the client side if that enables a good DX on the server side. In practice, there will be many more servers than clients.
> This would’ve worked equally well as a plain HTTP-based protocol

With plain HTTP you can quite easily "stream" both the request's and the response's body: that's an HTTP/1.1 feature called chunked transfer encoding, or "chunking" (the message body is not sent as one byte array; it's split into chunks, each prefixed with its size, so the receiver can process them as they arrive). I really don't get why people think you need WS (or ffs SSE) for "streaming". I've implemented a chat using just good old HTTP/1.1 with chunking. Sending tokens as they are generated is a perfect use case for it, so it suits LLMs quite well.
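For anyone who hasn't seen chunking on the wire, here is a minimal sketch using only the Python standard library. The `TOKENS` list is a made-up stand-in for LLM output, and `ChunkedHandler`, `fetch_streamed`, and `demo` are names invented for this example. The server writes each token as its own chunk (hex size, CRLF, data, CRLF) and ends with a zero-length chunk; the client reads pieces as they arrive.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

TOKENS = ["Hello", ", ", "world", "!"]  # stand-in for tokens from an LLM


class ChunkedHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # chunked transfer encoding needs HTTP/1.1

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Transfer-Encoding", "chunked")
        self.end_headers()
        for token in TOKENS:
            data = token.encode("utf-8")
            # One chunk: size in hex, CRLF, payload, CRLF.
            self.wfile.write(f"{len(data):X}\r\n".encode("ascii") + data + b"\r\n")
            self.wfile.flush()
        self.wfile.write(b"0\r\n\r\n")  # zero-length chunk ends the body

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_streamed(port: int) -> str:
    conn = http.client.HTTPConnection("127.0.0.1", port)
    conn.request("GET", "/")
    resp = conn.getresponse()
    received = b""
    while True:
        piece = resp.read1(1024)  # returns whatever is available, b"" at the end
        if not piece:
            break
        received += piece
    conn.close()
    return received.decode("utf-8")


def demo() -> str:
    server = ThreadingHTTPServer(("127.0.0.1", 0), ChunkedHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch_streamed(server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()
```

A real chat client would act on each piece inside the loop instead of accumulating it; the accumulation here is only to make the result easy to check.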

Well, the point is to provide context, and that's easier to do if the server has state.

For example, say you have an MCP client (let's say it's Amazon Q CLI) and an MCP server for executing commands over SSH. If the connection between the MCP client and server is maintained, then the MCP server can keep the SSH connection alive.

Replace the SSH server with anything else that has state, a browser for example (now your AI assistant can also have 500 open tabs).
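A rough sketch of what "the server has state" buys you, assuming one session object per client connection. `FakeSshConnection` is a hypothetical stand-in that only tracks a working directory (it does not use a real SSH library), and `Session` is a name invented for this example.

```python
class FakeSshConnection:
    """Hypothetical stand-in for a real SSH connection; tracks only a cwd."""

    def __init__(self, host: str):
        self.host = host
        self.cwd = "/"
        self.closed = False

    def run(self, command: str) -> str:
        if command.startswith("cd "):
            self.cwd = command[3:]
            return ""
        if command == "pwd":
            return self.cwd
        return f"ran {command!r} on {self.host}"

    def close(self) -> None:
        self.closed = True


class Session:
    """Lives as long as the client connection and owns the remote resource."""

    def __init__(self, host: str):
        self._host = host
        self._ssh = None
        self.connections_opened = 0

    def call_tool(self, command: str) -> str:
        if self._ssh is None:
            # Opened lazily on first use, then reused for every later call.
            self._ssh = FakeSshConnection(self._host)
            self.connections_opened += 1
        return self._ssh.run(command)

    def close(self) -> None:
        # Called when the client disconnects; tears down the held resource.
        if self._ssh is not None:
            self._ssh.close()
            self._ssh = None
```

With this shape, a `cd /var/log` call followed by a `pwd` call returns `/var/log`, because both calls hit the same held connection. A fully stateless server would have to reconnect per call, or pass some session identifier around to get the same effect.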