by cbolat 1334 days ago
Honestly, why XML? Isn't JSON 100 times better, smaller and easier to use than XML? Why choose a legacy format (with known security problems in parsers, compatibility issues, and unnecessary use of bandwidth) in 2022?
7 comments

> Isn't JSON 100 times better, smaller and easier to use than XML?

Quite certainly not. It's not even more popular if HTML is included.

It may be smaller, but not necessarily so: both JSON and XML compress well because of their repetitive overhead characters. Uncompressed, it depends on the use case and implementation: the ability to have both attributes and content on a node allows XML to be smaller than JSON, which lacks this (though XML certainly isn't always smaller).
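A rough way to check this on your own data is sketched below. The records are invented for illustration, and the one-letter tag name `i` is a deliberate choice that favours XML; longer element names would tip the raw sizes the other way. It only uses the Python standard library.

```python
import gzip
import json
import xml.etree.ElementTree as ET

# Hypothetical records, invented for this comparison.
records = [{"id": i, "lang": "en", "text": f"item {i}"} for i in range(100)]

# JSON: every field is a key/value pair inside an object.
json_bytes = json.dumps(records, separators=(",", ":")).encode("utf-8")

# XML: metadata goes into attributes, the payload into element content.
root = ET.Element("items")
for r in records:
    node = ET.SubElement(root, "i", id=str(r["id"]), lang=r["lang"])
    node.text = r["text"]
xml_bytes = ET.tostring(root, encoding="utf-8")


def sizes(data):
    """Return (raw size, gzip-compressed size) in bytes."""
    return len(data), len(gzip.compress(data))


json_raw, json_gz = sizes(json_bytes)
xml_raw, xml_gz = sizes(xml_bytes)
print(f"JSON raw={json_raw} gzip={json_gz}")
print(f"XML  raw={xml_raw} gzip={xml_gz}")
```

With this particular layout the raw XML comes out smaller than the raw JSON, and both shrink a lot under gzip. Change the tag names or the nesting and the result changes, which is the "it depends" part.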

Ease of use depends on the features: JSON is rapidly gaining complexity (JSON-LD, JSON templates, JSON Schema, etc.) to fill in what XML can do OOTB. Sure, all-out XML (XSLT, DTD, etc.) is far more complex than simple JSON. But it is hardly more complex than JSON with JSLT, JSON Schema, etc. The exact same performance and security problems arise in JSON with all these features bolted on.

In other words: "it depends". But the idea that "JSON is 100x better" is often repeated in the tech scene, yet impossible to back up in general. JSON may certainly be better for your case. But so can XML.

> Quite certainly not. It's not even more popular if HTML is included.

Nowadays HTML is not XML anymore [1].

[1]: https://stackoverflow.com/a/39560454/735926

What?! You mean everyone is just endlessly reinventing the wheel? You mean they start out simple then quickly realise much of the complexity of the prior is useful?!

Well I never!

I wouldn't call JSON a reinvented wheel. It has different tradeoffs, and so fills a real niche that XML couldn't.

However, we, the engineering community, tend to fall into a hammer-nail mindset: we know and like JSON, and then start applying it outside of its fitting niche.

We start using it for configuration (only to find out that its lack of features, such as not allowing comments, makes it fully unusable).
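To make the comments point concrete, here is a minimal sketch using Python's standard `json` module. The config text and the `try_parse` helper are made up for illustration.

```python
import json

# Hypothetical config text; the // comment is what a human would want to write.
config_text = """
{
  // retry failed requests up to three times
  "retries": 3
}
"""


def try_parse(text):
    """Return the parsed object, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


with_comment = try_parse(config_text)
without_comment = try_parse('{"retries": 3}')
print(with_comment)     # None: the comment makes the document invalid JSON
print(without_comment)  # {'retries': 3}
```

Some parsers offer a lenient mode or a JSON superset that accepts comments, but then the file is no longer plain JSON.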

We invent stuff like symlinks, linked data, schemas, JSLT, etc. Rather than moving to a much more appropriate format, we change the inappropriate format into an abomination.

Tooling around JSON Schema is lacking. For example, we have a fairly simple XSD whose JSON Schema equivalent still requires the latest JSON Schema version due to a choice element (IIRC), which Swagger doesn't fully support yet.
I think the reason is that most people don't define schemas for their JSON to begin with.

Back when XML was still commonly used, XML schemas were optional as well. The insistence on adding longish URLs to every attribute and namespacing elements and attributes kind of defeated the purpose of using XML. It made everything harder, including parsing, XSL transformations, XPath expressions, etc.
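A small sketch of how a namespace changes even a trivial path query, using Python's standard `xml.etree.ElementTree`. The documents and the namespace URI are placeholders invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents; the namespace URI is a placeholder.
plain = "<order><item>book</item></order>"
namespaced = (
    '<order xmlns="http://example.com/ns/order">'
    "<item>book</item></order>"
)

plain_root = ET.fromstring(plain)
ns_root = ET.fromstring(namespaced)

# Without namespaces the path is just the tag name.
plain_item = plain_root.find("item")

# With a default namespace the same path no longer matches anything...
missed = ns_root.find("item")

# ...and the query has to carry the URI, here via a prefix map.
ns = {"o": "http://example.com/ns/order"}
found = ns_root.find("o:item", ns)

print(plain_item.text, missed, found.text)  # book None book
```

The same extra ceremony shows up in XPath and XSLT, where every step needs a bound prefix once the document declares a namespace.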

"adding longish URLs to every attribute and namespacing elements and attributes"

Who thought this was a good idea?

With EXI encoding, XML is many times smaller than JSON; it can use an XSD for schema-aware compression, and it is better structured with XSD.

https://www.w3.org/TR/exi/

Maybe because other standards (https://www.datex2.eu/datex2/about) in the same area are also XML-based, I wonder :-)
While I prefer JSON, I absolutely would not consider XML a legacy format.
JSON is terribly unstructured.
There's https://json-schema.org for that.
It looks like the primary implementation is in Java, which is still all about SOAP services and XSDs.
We are on Java 19 now. Several people who voice the same sentiment as you seem to be stuck with Java 8 and EE legacy projects from 2012.