by rbehrends 4481 days ago
XML is unnecessarily verbose, for the supposed sake of human readability. But used as a serialization format, it isn't really readable or editable by humans (except in the sense that a Turing machine is programmable): remember that the ML in XML stands for "markup language", and SGML, its predecessor, was designed as a way of marking up normal text, not littering data with angle brackets and identifiers. (XML/SGML arguably isn't that hot as a markup language, either.)

If you really need a hierarchical serialization format that is "verified for validity and syntax", the problem is that XML has prevented the adoption of something better (because it was "good enough").

If you don't need that, then XML is overkill and bloat and makes your format less readable than it could be. And you rarely need it, because either your data is computer-generated and -read, so there's little point in putting in extra schema checks, or schema verification is woefully insufficient (because it can't verify the contents of fields, relations between fields, or a ton of other stuff that can accidentally go wrong).
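To make the "relations between fields" point concrete, here is a minimal Python sketch. The `booking` document, its element names, and both helper functions are made up for illustration, and `structurally_valid` is only a rough stand-in for the kind of structural check a DTD performs, not a real validator. The document passes the structural check even though its dates are in the wrong order.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical document: well-formed and structurally fine, but the end date
# comes before the start date.
DOC = """
<booking>
  <start>2014-03-10</start>
  <end>2014-03-07</end>
</booking>
"""

def structurally_valid(root):
    """Rough stand-in for a DTD-style check: expected elements in expected order."""
    return root.tag == "booking" and [child.tag for child in root] == ["start", "end"]

def semantically_valid(root):
    """Cross-field rule that a purely structural check does not see."""
    start = date.fromisoformat(root.findtext("start"))
    end = date.fromisoformat(root.findtext("end"))
    return start <= end

root = ET.fromstring(DOC.strip())
print(structurally_valid(root), semantically_valid(root))
```

The second check has to live in application code either way, which is the sense in which structure-only validation is insufficient.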


You fail to address OP's question:

> But if I want something that is able to express data structures customized by myself, usually with hierarchical data that can be verified for validity and syntax (XML Schemas or old-school DTD), what other options are there?

He actually did address my question in a way: "[...] XML has prevented the adoption of something better (because it was "good enough")."

Which IMO is a sensible way of looking at it. I too think XML is not perfect, but if all the other stuff we're currently stuck with were as "good enough" as XML, IT would be a place with fewer WTFs all around. ;-)

I did address it. Did you read my comment until the end?
> the problem is that XML has prevented the adoption of something better

What would be better?

That depends on your specific goals. But essentially, XML schemas are sort of like attribute grammars, except with an unnecessarily convoluted syntax, and yet more limited in their expressiveness than attribute grammars (because whatever constraints you need have to be procrusteanized into XML schemas).

Even if you were to stick with XML semantics as is, you could improve the syntax to be actually readable and eliminate the angle bracket tax [1, 2].

[1] http://blog.codinghorror.com/xml-the-angle-bracket-tax/

[2] http://blog.codinghorror.com/revisiting-the-xml-angle-bracke...

I think S-expressions would have been better.
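For comparison, a rough sketch of what hierarchical data looks like as an S-expression, plus a toy reader in Python. The `booking` record and the `tokenize`/`parse` helpers are invented for illustration. The reader handles only parentheses and bare atoms (no quoted strings, no error handling for unbalanced input), so treat it as a demonstration of how little syntax is involved rather than as a usable parser.

```python
def tokenize(text):
    """Split an S-expression into parentheses and bare atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    """Recursively build nested lists from a token list (consumes the list)."""
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse(tokens))
        tokens.pop(0)  # drop the closing ")"
        return items
    return token  # a bare atom, kept as a string

# Hypothetical record, written as an S-expression.
SEXP = "(booking (start 2014-03-10) (end 2014-03-07))"

tree = parse(tokenize(SEXP))
print(tree)
```

The nesting maps directly onto nested lists, with the first item of each list acting as the element name.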

Alternatively, Carl Sassenrath was pushing Rebol in the past. See his blog post "Was XML Flawed from the Start?": http://www.rebol.com/article/0108.html

Update: Just posted the above blog link to HN: https://news.ycombinator.com/item?id=7361260

I'm a fan of edn myself. https://github.com/edn-format/edn