by datenarsch 1678 days ago
> XMPP is fundamentally flawed

how is it fundamentally flawed, can you elaborate?

I'm not going to claim that it's fundamentally flawed, but here's an anecdote. Many years ago, when XML was having its day in the sun, long before it was sidelined by the simplicity of REST and JSON, I was at a JavaOne convention listening to a speaker present on some new XML parsing API. After the talk, I approached the presenter for some Q&A to ask how one might use the API to parse the Jabber protocol, which may or may not be relevant to what XMPP is today (I haven't been keeping up).

The presenter was unfamiliar with the protocol, so I had to describe how the XML document is opened when you establish a connection, how elements keep getting appended to it, and how the "XML document" isn't really complete until you're all done and the connection is terminated.

They looked at me like I had two heads.

To them, XML didn't make any sense at all unless you have the entire document available all at once. After all, how on earth could one ever apply an XSLT transform to it, right!?

Good times.

> After all, how on earth could one ever apply an XSLT transform to it, right!?

There are streaming APIs for XML, and XSLT 3.0 can do streaming as well. Saxon has implemented it, for example[1]. I am aware that you are talking about the past, but the XML world also moves forward, albeit slowly, since the community has gotten much smaller.

[1]: https://www.saxonica.com/html/documentation10/sourcedocs/str...
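
To make the streaming point concrete, here is a minimal sketch using `XMLPullParser` from Python's standard library. The `<stream>` root and the stanzas are simplified stand-ins I made up (real XMPP uses a namespaced `stream:stream` element); the point is only that a pull parser can hand back finished children while the root is still open.

```python
from xml.etree.ElementTree import XMLPullParser

# Chunks as they might arrive over a socket (made-up data). The root
# <stream> element is opened first and never closed in this sample.
chunks = [
    "<stream><message to='a@example.org'>",
    "<body>hi</body></message><presence/>",
    "<message to='b@example.org'><body>yo</body>",
    "</message>",
]

parser = XMLPullParser(events=("start", "end"))
depth = 0
stanzas = []
for chunk in chunks:
    parser.feed(chunk)
    for event, elem in parser.read_events():
        if event == "start":
            depth += 1
        else:
            depth -= 1
            if depth == 1:
                # A direct child of the still-open root just closed.
                stanzas.append((elem.tag, elem.get("to"), elem.findtext("body")))

print(stanzas)
```

Note that in this sketch every finished stanza stays attached to the root element, so a long-lived connection would have to remove them to keep memory flat.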

It was common to have to parse at a level below XML (angle bracket counting) to split out each stanza, then parse each one as a separate XML document.
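
A rough sketch of what that kind of splitting can look like, in Python. The function name `split_stanzas` and the sample data are mine, and the tag regex is deliberately naive: it assumes no comments, no CDATA sections, and no `>` inside attribute values, so treat it as an illustration rather than something to deploy.

```python
import re
import xml.etree.ElementTree as ET

TAG = re.compile(r"<[^>]*>")  # naive: any run from '<' to the next '>'

def split_stanzas(buffer):
    """Return the complete depth-1 elements found in buffer as strings."""
    depth = 0
    start = None
    stanzas = []
    for match in TAG.finditer(buffer):
        tag = match.group()
        if tag.startswith(("<?", "<!")):
            continue  # XML declaration or doctype: not an element
        if tag.startswith("</"):
            depth -= 1
        elif tag.endswith("/>"):
            if depth == 1:
                stanzas.append(tag)  # empty-element stanza
            continue
        else:
            if depth == 1:
                start = match.start()  # a new stanza opens under the root
            depth += 1
        if depth == 1 and start is not None:
            stanzas.append(buffer[start:match.end()])
            start = None
    return stanzas

data = "<stream><message to='a'><body>hi</body></message><presence/><message>"
raw = split_stanzas(data)
parsed = [ET.fromstring(s) for s in raw]  # each stanza as its own document
print(raw, [e.tag for e in parsed])
```

One catch with parsing the slices separately: any namespace prefix declared on the stream root is not visible to them, so a namespace-aware parser would reject a stanza that relies on it.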

You'd also have to explicitly turn XML namespace support _off_, since so many systems didn't actually support them. XML defines well-formedness and namespace-well-formedness as two different things, and you didn't want to completely drop communication because the other side was sending a message that didn't meet the more stringent requirement.
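
For anyone who hasn't run into the distinction, a small sketch with Python's expat binding shows it. The sample document and the helper name `parses` are made up; `namespace_separator` is the real `ParserCreate` argument that switches namespace processing on.

```python
import xml.parsers.expat as expat

# The prefix "x" is never declared: well-formed XML, but not
# namespace-well-formed.
doc = "<x:message><body>hi</body></x:message>"

def parses(data, namespace_aware):
    # Passing a separator enables expat's namespace processing.
    sep = " " if namespace_aware else None
    parser = expat.ParserCreate(namespace_separator=sep)
    try:
        parser.Parse(data, True)
        return True
    except expat.ExpatError:
        return False

plain_ok = parses(doc, namespace_aware=False)
ns_ok = parses(doc, namespace_aware=True)
print(plain_ok, ns_ok)
```

With namespace processing off the parser accepts the stanza; with it on, the unbound prefix is a fatal error, which over a stream means the whole connection is lost.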

Some implementations would figure out ways to incrementally disrupt and extract elements from the DOM - but this would sometimes cause resource leaks due to the design of the W3C DOM itself.

Expat had explicit support for parsing Jabber/XMPP messages very early, and was by far the most commonly used XML component for building libraries.

That's what Jabber is? Streaming XML? Whoa boy...

I really wanted to like XMPP back in the day, but honestly I always ended up feeling that the protocol is just bad. This idea of opening an XML document at the beginning of the stream, then only allowing a subset of XML, plus all the mess with xmlns: all of that brings nothing to the table except complexity.

I think that ideally, in a good protocol, the server should not have to parse the content of messages that are not targeted at itself (only the metadata useful for routing). The XML mess makes that impossible, since you have to validate the full document.

At the time I think this page was a good summary of the issues https://about.psyc.eu/XMPP No idea if this is still relevant though.

I've built an XMPP client for an internal application. My impression was the protocol was overly complex, starting with the lower level problems you describe ("streaming XML.")

It's been years since I've looked at it. Maybe things are better now.

There were several members of the core team, pre-XMPP effort in IETF, who wanted to change it to have framing. If not a length-prefixed model, a nil-separated one.
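
To illustrate the two framing models, here is a minimal Python sketch. Nothing here is from any actual proposal: the 4-byte big-endian length header and the function names are my own assumptions. The NUL variant works for XML payloads because U+0000 is not a legal character in XML 1.0.

```python
import struct

def frame_length_prefixed(payload: bytes) -> bytes:
    # Assumed layout: 4-byte big-endian length, then the payload.
    return struct.pack("!I", len(payload)) + payload

def read_length_prefixed(buffer: bytes):
    """Return (complete_frames, unconsumed_remainder)."""
    frames = []
    offset = 0
    while len(buffer) - offset >= 4:
        (length,) = struct.unpack_from("!I", buffer, offset)
        if len(buffer) - offset - 4 < length:
            break  # payload not fully received yet
        frames.append(buffer[offset + 4:offset + 4 + length])
        offset += 4 + length
    return frames, buffer[offset:]

def read_nul_separated(buffer: bytes):
    """Return (complete_frames, unconsumed_remainder)."""
    *frames, remainder = buffer.split(b"\x00")
    return frames, remainder

wire = frame_length_prefixed(b"<message/>") + frame_length_prefixed(b"<presence/>")
partial = frame_length_prefixed(b"<iq/>")[:5]  # header plus one payload byte
lp_frames, lp_rest = read_length_prefixed(wire + partial)
nul_frames, nul_rest = read_nul_separated(b"<message/>\x00<presence/>\x00<iq")
print(lp_frames, lp_rest, nul_frames, nul_rest)
```

Either way the receiver knows where a stanza ends without running an XML parser over the stream first.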

There were ideas to separate out the addressing/routing from the actual messaging, so that servers did not need to process XML and so that messages did not necessarily need to be XML.
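
As a sketch of what that separation could look like (the envelope layout, `pack_envelope`, `route`, and the `outboxes` dict are all hypothetical, not anything that was actually specified): the server reads a small binary header to find the recipient and forwards the payload as opaque bytes.

```python
import struct

HEADER = "!HI"  # assumed layout: 2-byte address length, 4-byte payload length

def pack_envelope(to: str, payload: bytes) -> bytes:
    addr = to.encode("utf-8")
    return struct.pack(HEADER, len(addr), len(payload)) + addr + payload

def route(envelope: bytes, outboxes: dict) -> None:
    """Queue the payload for its recipient without inspecting it."""
    addr_len, payload_len = struct.unpack_from(HEADER, envelope, 0)
    start = struct.calcsize(HEADER)
    to = envelope[start:start + addr_len].decode("utf-8")
    body_start = start + addr_len
    payload = envelope[body_start:body_start + payload_len]
    outboxes.setdefault(to, []).append(payload)  # payload is never parsed

outboxes = {}
route(pack_envelope("bob@example.org", b"<message><body>hi</body></message>"), outboxes)
route(pack_envelope("bob@example.org", b"\xff\x00 not XML at all"), outboxes)
print(outboxes)
```

The second call is the interesting one: the payload is not XML, and the routing code neither knows nor cares.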

There were even some very preliminary ideas on using the servers to potentially negotiate peer connections for arbitrary traffic.

Back in the early 2000s there were a _lot_ of self-hosted Jabber servers though, and there was push-back from early commercial interests on any protocol-breaking changes resetting adoption to zero. This resulted in Jabber 1.0 pretty much becoming the basis of the XMPP RFC, with XMPP adding new authentication techniques and internationalized JIDs.

Later on, there were efforts to accomplish some of these goals through alternative transports and forms - HTTP endpoints to poll for messages, JSON mappings of the core messages, etc.

I would argue against PSYC's claims (or would have, back in the day) that the usage of XML in XMPP is not proper, however. It's proper, it just wasn't the best idea.

Well, I guess it kinda makes sense? Beats having to reinvent a tokenization format from scratch?

Of course, smaller messages (à la Matrix) probably make more sense.

Ultimately the issue is that XML is a document markup language, not a purpose-built streaming protocol. You can kind of force it into that role, but the parser ends up being overly complex and error-prone. It's like basing a chat app around creating an MS Word document for every message and sending it across the network.

I still find it amusing that XMPP by design violates the XML spec, thus requiring its own custom XML parser.

It does not violate the spec, but it violates the expectations of a lot of XML tooling for sure.

It uses a subset of the XML spec. It does not need a custom parser; any existing parser will do.