by abarrak 2778 days ago
I grouped XML quotes to share with my manager:

The essence of XML is this: the problem it solves is not hard, and it does not solve the problem well.

        — Phil Wadler, POPL 2003
XML is like violence: if it doesn’t solve your problem, you aren’t using enough of it.

        — Heard from someone working at Microsoft
XML is like violence. Sure, it seems like a quick and easy solution at first, but then it spirals out of control into utter chaos.

        — Sarkos in reddit
Most xml i’ve seen makes me think i’m dyslexic. it also looks constipated, and two health problems in one standard is just too much.

        — Charles Forsyth
Nobody who uses XML knows what they are doing.

        — Chris Wenham
3 comments

That page misses my favorite XML quote, which is also on cat-v at http://harmful.cat-v.org/software/xml/:

XML is a classic political compromise: it balances the needs of man and machine by being equally unreadable to both.

  - Matthew Might
What are some valid complaints about XML? I was talking to one of the older IT guys at my company a while ago, and he was all about it because it made serializing data structures very simple. I'm not sure if that's a valid use case or an example of the kind of monstrosity that XML-haters hate.
If all you need to do is serialize data structures, we solved that problem 30 years ago—including all the schema-validation and querying stuff—with ASN.1 (and then re-solved it with Protobufs and Thrift, and solved it again in half-assed manners with JSON, YAML, etc.)

XML is a markup format, a regularization/minimization of the syntax of SGML. It's good at being a markup format—the paired named open+close tags allow for corrupted-stream repair in a way that e.g. Markdown just doesn't. XML is great in, say, DocBook.

But XML gets used for pretty much everything except actual markup. And for everything else, it's not solving those problems well.

It's very complicated, which leads to severe security problems; that means you should basically avoid any untrusted, client-supplied XML if you can at all. It is a step backward from s-expressions, being harder to parse for both machines and humans whilst being less expressive. Whitespace handling is comically bad. It doesn't have a working comment syntax. The list goes on. The only excusably bad thing about XML is namespaces: the design is not a success, but it springs from a list of desiderata which look quite sensible at first sight.
XML conflates a bunch of stuff (XSLT, XPath, Schemas, namespaces, etc.), making it overly complicated.
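The comment above does not say which security problems it means. One well-known class is entity expansion: a document can declare entities in its own DTD, and the parser substitutes them while reading. A minimal Python sketch, assuming the standard-library ElementTree parser and a deliberately harmless three-fold expansion (the entity names `a` and `b` are made up for illustration):

```python
import xml.etree.ElementTree as ET

# The document defines its own entities; "b" is built from three copies of "a".
# Nesting more levels of this pattern is the idea behind the so-called
# "billion laughs" denial-of-service input.
payload = (
    '<!DOCTYPE d ['
    '<!ENTITY a "ha">'
    '<!ENTITY b "&a;&a;&a;">'
    ']>'
    '<d>&b;</d>'
)

root = ET.fromstring(payload)

# By the time the application sees the text, the parser has already
# expanded the entity references supplied by the document's author.
expanded = root.text
print(len(payload), "bytes of input ->", repr(expanded))
```

The point is only that the sender controls how much text the receiver materializes; for untrusted input, the Python documentation recommends a hardened parser such as the third-party defusedxml package.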

Then it has 3 different kinds of data: elements, attributes, and content. In other words, <a c="foo">bla<b/>bla</a> is a mess to deal with. Now replace each "bla" with spaces and assume <a> is only supposed to have <b> as a child; in reality it will have 3 children: the whitespace before the <b>, the <b> itself, and the whitespace after it.
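The three-children effect is easy to reproduce. A small sketch using Python's standard-library `xml.dom.minidom`, which keeps whitespace-only text nodes; the indented document is an assumed example in the spirit of the one above:

```python
from xml.dom import minidom

# <a> is "supposed" to contain only <b>, but it is pretty-printed,
# so there is whitespace on both sides of <b/>.
doc = minidom.parseString('<a c="foo">\n  <b/>\n</a>')
a = doc.documentElement

# nodeName is "#text" for text nodes and the tag name for elements.
child_names = [node.nodeName for node in a.childNodes]
print(child_names)

# Code that wants "the children of <a>" has to filter by node type itself.
element_children = [
    node.nodeName for node in a.childNodes
    if node.nodeType == node.ELEMENT_NODE
]
print(element_children)
```

The first list has three entries (text, `b`, text) and the second has one, so every consumer ends up deciding for itself whether that whitespace is data.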

I think this is why JSON won over XML. It seems like all those extra features on XML would be a benefit but in reality they just give you more places to hang yourself.

It does not actually solve a problem. OK, I cannot disagree with derefr's comment here regarding text markup. But when it comes to serialization, it does not define how to serialize data items. It does have different mechanisms to group them: attributes and nesting. Nesting is usually wrong (one reason is that it only works for 1:n relations, so you need named cross-references (a.k.a. foreign keys in a saner world) anyway). And there are multiple obligatory escaping schemes (quoted attributes, body text, and maybe we could also count tag names and attribute keys).
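To illustrate the point about multiple escaping schemes, here is a short Python sketch using `xml.sax.saxutils` from the standard library. The sample string is made up; the same value comes out differently depending on whether it is headed for body text or for an attribute:

```python
from xml.sax.saxutils import escape, quoteattr

value = 'it\'s "x" & y'

# Body text: only &, < and > need escaping; quotes pass through untouched.
as_text = escape(value)

# Attribute value: the result must also be wrapped in quotes, and whichever
# quote character is used as the delimiter has to be escaped inside it.
as_attr = quoteattr(value)

print("<item>" + as_text + "</item>")
print("<item v=" + as_attr + "/>")
```

Here `as_text` keeps both kinds of quotes verbatim, while `as_attr` is wrapped in double quotes and has its inner double quotes turned into `&quot;`.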

The hierarchy stuff is so complicated and wrong that I've personally never witnessed a colleague actually using a data schema, which means data does not actually get validated.

On top of that it is near unreadable. Compare to something simple like http://jstimpfle.de/projects/wsl/world_x.wsl.txt

The whitespace situation is really bad, and it also means the format is not canonical at all.

I have to second this; I do not understand why XML receives so much hate. Sure, it is verbose, and editing it without, at the very least, good syntax highlighting can be a bit of a pain. And namespaces. But just use a good XML editor and the pain is gone. And in return you get a mature and powerful ecosystem where communicating data formats and validating data with XSDs or transforming data with XSLTs is done easily. And you have to write almost no code if your language has a decent XML library. Serialization and deserialization are usually easy, and there are tools for generating matching classes for your schemas. I am not really following the development of other serialization formats like JSON, but as far as I can tell at least the JSON ecosystem is essentially coming up with analogous ideas like JSON Schema to solve the same problems that XML has already solved.
I find XML good for configuring pipelines since there's a clear "parent" node. Most other things I use JSON for. In other words, XML is good when a human is going to be editing it, but the ROI isn't enough to make a GUI for.
I don't think it's true that XML - in itself - makes serializing data structures very simple. An easy to use XML library might. But that library could use any structured data file format at all, and still make serializing data structures very simple.
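A rough sketch of that argument in Python, using only the standard library. The `Point` class and the `to_json`/`to_xml` helpers are hypothetical names invented for this example; the caller's code looks the same whichever format the helper happens to emit:

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass, asdict


@dataclass
class Point:
    x: int
    y: int


def to_json(obj) -> str:
    # One call: dataclass -> dict -> JSON text.
    return json.dumps(asdict(obj))


def to_xml(obj) -> str:
    # Same input, different output format: one child element per field.
    root = ET.Element(type(obj).__name__)
    for name, value in asdict(obj).items():
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


p = Point(1, 2)
json_text = to_json(p)
xml_text = to_xml(p)
print(json_text)
print(xml_text)
```

The simplicity the caller sees comes from the helper, not from the format. Note that the XML helper already had to make a choice (fields as child elements rather than attributes) and turns the integers into strings, which a reader would have to convert back.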
Years ago there was a list of top 10 lists I had printed out, but I haven't been able to find them. One of the lists was "Top 10 signs you're a Microsoft programmer" and #1 or 2 was something along the lines of: "You think human teleportation will eventually be possible, and XML will be the transport."