by joepie91_ 1523 days ago
> For messaging purposes, it is absolutely identical to JSON in this regard. Once you have many existing implementations out in the wild that need to interact with each other, you'll need namespaces or their NIH analogue. And JSONs with namespaces will be just as bloated as XML.

It's definitely not identical. There's one major difference that rarely gets mentioned but seems to be responsible for most of the pain:

JSON maps directly to language-native data structures, but XML does not.

The reason for this is that XML distinguishes between child nodes and attributes, but in the basic data structures that most languages have, there's no way to represent this in a conflict-free manner without doing something like `{ attributes: { ... }, children: [ ... ] }`, which is an awkward structure to query.

The result is that when you're working with XML, you almost always end up with a "data API" that's designed specifically for XML (eg. classes representing nodes with getters for attributes and children), whereas with JSON you're generally working with simple nested lists and structs.
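To make that concrete, here's a rough sketch using only Python's standard library. The two payloads are made up for illustration and aren't from any real protocol; the point is just to show what kind of object you get back from each parser.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical payloads, invented for this example.
JSON_DOC = '{"message": {"id": "42", "body": "hello", "tags": ["a", "b"]}}'
XML_DOC = (
    '<message id="42"><body>hello</body>'
    "<tags><tag>a</tag><tag>b</tag></tags></message>"
)

# JSON: the parser hands back plain dicts, lists and strings,
# so ordinary indexing is all you need.
data = json.loads(JSON_DOC)
json_id = data["message"]["id"]
json_tags = data["message"]["tags"]

# XML: the parser hands back an Element object, with one accessor
# for attributes (.get) and different ones for children (.find, .findall).
root = ET.fromstring(XML_DOC)
xml_id = root.get("id")
xml_body = root.find("body").text
xml_tags = [tag.text for tag in root.findall("tags/tag")]
```

Both halves pull out the same values, but the second half only makes sense to a reader who already knows the ElementTree API.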

This introduces a lot of complexity in what's a "hot path" in terms of development effort - every single time you want to interact with the deserialized data (which is all the time!), you have to deal with an XML-specific data structure instead of treating it as "just another source-agnostic pile of structs/objects".

The result is tightly coupled code that requires the reader to be familiar with the specific XML API/representation being used, instead of code that relies on a 1:1 mapping from the serialized data to a language-native data structure. That's not good!

While it is true that you are likely to implement a namespace-like mechanism in JSON as well, for open protocols, the difference is that the format itself does not specially define namespaces - which means that the 1:1 mapping always remains true, and it's the protocol developer's choice to eg. reserve a `children` property to contain child nodes, with the understanding that that property then cannot be used for other things anymore.

XML deserializers cannot do this; it's entirely possible for XML to exist that looks like `<tag children="yes"><subtag/></tag>`, and therefore there is no possible 1:1 mapping that a deserializer can use that's guaranteed not to make certain data inaccessible. And since child nodes are not represented as an attribute in XML's data model, you cannot expect protocol designers to avoid a specific attribute either.
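Here's a small sketch of that collision in Python. Both `naive_flatten` and `wrapped` are hypothetical helpers written for this example, not functions from any library; the first tries the "obvious" 1:1 mapping, the second uses the wrapper structure that avoids the conflict.

```python
import xml.etree.ElementTree as ET


def naive_flatten(element):
    """Hypothetical 1:1 mapping: attributes become keys, child nodes go under "children"."""
    result = dict(element.attrib)
    # This assignment silently overwrites any attribute that is also named "children".
    result["children"] = [naive_flatten(child) for child in element]
    return result


def wrapped(element):
    """Conflict-free mapping, at the cost of two fixed wrapper keys on every node."""
    return {
        "tag": element.tag,
        "attributes": dict(element.attrib),
        "children": [wrapped(child) for child in element],
    }


root = ET.fromstring('<tag children="yes"><subtag/></tag>')
flat = naive_flatten(root)  # the attribute value "yes" is no longer reachable
safe = wrapped(root)        # the attribute survives under safe["attributes"]
```

The naive version loses the `children="yes"` attribute, while the wrapped version keeps it but forces every lookup through `attributes` or `children`, which is exactly the awkward structure described above.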

Bottom line: I suspect that this seemingly small design difference has far-reaching second-order effects in how deserializers are implemented, and that that is responsible for most of the pain that people experience in working with XML.

1 comment

> JSON maps directly to language-native data structures, but XML does not.

So what? It is irrelevant. In all the problems we have ever faced when creating federated chat applications on various platforms, I can't recall a single time when stream format made any difference.

For example, making your messages appear in a consistent order on all of your and your chat partner's devices is a real problem. Developing an efficient strategy to achieve this has nothing to do with the format, yet it has a much more visible impact on user experience. In fairness, debating whether Message Sequence Charts [1] or Sequence Diagrams [2] are better suited for protocol development makes more sense than debating XML vs JSON.

[1]: https://en.wikipedia.org/wiki/Message_sequence_chart

[2]: https://en.wikipedia.org/wiki/Sequence_diagram