by bayindirh 1176 days ago
No. JSON is great as JavaScript's serialization format, but it's not as readable and robust as XML, period.

I use both extensively, and for bigger objects and definitions, XML is a very clear winner.

I'm a big believer in a horses-for-courses approach, and my personal gripe is the push to replace one thing with another. These data formats can coexist and can be used where they shine. XML can be read and written stupidly fast, so it's way better as an on-disk file format if people are going to touch that file.

YAML and JSON are not the best fit for configuration files. JSON is good as an on-disk serialization format if humans are not going to touch it. XML is the best format for carrying complex and big data around. TOML is the best format for human-readable, human-editable config files.

4 comments

My only quibble is that both are basically unreadable in most use cases. Most programs worth anything that use these formats strip out all the extra spaces and formatting, so you usually have to take an extra step to 'reformat' the file just so you can read it. And anyone who has had an open paren or a missing caret can tell you how painful it is to manually parse one of these with 400+ fields. Trying to say one is better than the other ignores the use cases for both: one is good at slugging data into JavaScript/Python, the other is good at light typing, annotation, and transformation.
I have never seen a tool that stores its XML config in a minified/uglified form by removing whitespace. The two biggest tools I use that rely on XML are Keycloak and Eclipse, and neither of them does this.
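For what it's worth, the extra 'reformat' step mentioned above can be sketched with nothing but the Python standard library. This is only an illustration: the sample documents and the helper names `pretty_json` and `pretty_xml` are made up for the example, not taken from any tool discussed here.

```python
import json
import xml.dom.minidom


def pretty_json(text: str) -> str:
    """Re-indent a compact JSON document so a human can read it."""
    return json.dumps(json.loads(text), indent=2)


def pretty_xml(text: str) -> str:
    """Re-indent a compact XML document so a human can read it."""
    return xml.dom.minidom.parseString(text).toprettyxml(indent="  ")


# Hypothetical minified inputs, standing in for a real config or payload.
compact_json = '{"server":{"host":"localhost","port":8080}}'
compact_xml = "<server><host>localhost</host><port>8080</port></server>"

print(pretty_json(compact_json))
print(pretty_xml(compact_xml))
```

Both helpers parse the whole document into memory first, so this is a convenience for reading files, not something to put on a hot path.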

All of the parsers and editors I have used for XML have always shown the exact place where a caret is missing or the XML is otherwise broken, so I have never had to hunt anything down inside a big XML file.
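As a rough illustration of a parser pointing at the broken spot, here is a minimal sketch using Python's built-in `xml.etree.ElementTree`. The broken document and the helper name `locate_error` are hypothetical, and other parsers and editors report errors in their own ways.

```python
import xml.etree.ElementTree as ET

# Hypothetical broken document: the closing tag on line 3 is misspelled.
broken = "<config>\n  <item>one</item>\n  <item>two</itm>\n</config>"


def locate_error(text: str):
    """Return (line, column) of the first parse error, or None if well-formed."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        # ParseError carries the position reported by the underlying parser.
        return err.position
    return None


print(locate_error(broken))
```

The line number in the returned tuple is what lets an editor jump straight to the problem instead of leaving you to scan the file.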

However, this doesn't invalidate your experience about unreadable XML files, which are most definitely present in the wild.

However, I agree that neither of them is a good config file format, but for storing data, I'll take XML all day, every day (except when I really need a binary file format, e.g. for compressing data).

What specifically about XML means it can be read/written "stupidly fast"?

It's still a text-bound serialization format; you still have to parse a tree for it.

Is it just particularly mature libraries?

It is primarily mature libraries, but XML is also more straightforward to parse, because there are not many data types and the tags make it very deterministic.

By "stupidly fast", I mean I can read a 120K XML file, parse it, and create the objects defined in that file in under 2 ms. The library I use (RapidXML [0]) can parse the file at almost the same time cost as running strlen() on the same file. That's insane.

[0]: https://rapidxml.sourceforge.net/
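Anyone who wants to check a timing claim like this on their own machine could start from a sketch along these lines. It is a stand-in only: it uses Python's `xml.etree.ElementTree` rather than RapidXML, the synthetic document (a bit over 120 KB) and the helper names `build_document` and `time_parse` are assumptions for the example, and the numbers it prints are not comparable to a C++ parser.

```python
import time
import xml.etree.ElementTree as ET


def build_document(n_items: int = 4000) -> str:
    """Build a synthetic XML document; 4000 items comes to a bit over 120 KB."""
    items = "".join(f'<item id="{i}">value {i}</item>' for i in range(n_items))
    return f"<items>{items}</items>"


def time_parse(text: str, repeats: int = 20) -> float:
    """Return the best wall-clock time in seconds over several parses."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        ET.fromstring(text)
        best = min(best, time.perf_counter() - start)
    return best


doc = build_document()
root = ET.fromstring(doc)
best = time_parse(doc)
print(f"{len(doc.encode('utf-8'))} bytes, {len(root)} items, "
      f"best parse {best * 1e3:.3f} ms")
```

Taking the best of several runs reduces noise from the OS scheduler, but it still measures only parsing, not reading the file or building application objects.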

Being a maintainer of the fastest XML library for Rust, I strongly disagree that XML is inherently fast to parse, and I question any such claim which comes with no evidence. Especially when it has remained unchanged on their page since (at least) 2008 [0]. Have you actually tested that claim or are you taking it at face value?

IME the XML spec is so complex that you either end up with a slow but compliant parser or a fast one that doesn't implement the spec completely.

JSON, unlike XML, is minimal enough that writing an entire compliant parser with SIMD intrinsics [1] is actually feasible in practice. That library claims a 3 GB/s parsing speed, which could theoretically process your 120 KB of data in 1/25,000th of a second instead of 2/1000ths of a second.

I would wager that JSON is faster to parse, on balance.

[0] https://web.archive.org/web/20080209172554/https://rapidxml....

[1] https://github.com/simdjson/simdjson

YAML is excellent as a post-natal abortion mechanism. Anyone working on a parser for it will question why they should go on living while YAML exists. Source: I'm developing a YAML parser.

What broke me were: plain string and empty node handling.

Here is a fun quiz. Which of these two documents is valid: one of them, both, or neither? With an explanation, of course.

Yaml#1

     :
Yaml#2

    :
XML is great except at being a configuration format, a messaging format, a serialization format, or any other purpose really. It's not insane like YAML, I'll give it that. I'll take XML over that garbage any day.