by tehbeard 1176 days ago
What specifically about XML means it can be read/written "stupidly fast"?

It's still a text-bound serialization format, and you still have to parse a tree for it.

Is it just particularly mature libraries?

It is primarily mature libraries, but XML is also more straightforward to parse, because there are not many data types and the tags make parsing very deterministic.

By "stupidly fast", I mean I can read a 120 KB XML file, parse it, and create the objects defined in that file in under 2 ms. The library I use (RapidXML [0]) can parse the file at almost the same time cost as running strlen() on the same file. That's insane.

[0]: https://rapidxml.sourceforge.net/
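A claim like "parsing costs about as much as strlen()" can be checked with a small timing harness. The sketch below is only an illustration of the measurement method: it uses Python's standard-library `xml.etree.ElementTree` rather than RapidXML, a generated test document (the `make_xml` and `best_of` helpers are hypothetical names), and a full byte scan via `bytes.count` as a stand-in for strlen(). Its numbers say nothing about RapidXML itself.

```python
import time
import xml.etree.ElementTree as ET


def make_xml(n_records: int) -> bytes:
    """Build a synthetic XML document (assumed shape, not from the thread)."""
    rows = "".join(
        f'<item id="{i}"><name>item{i}</name></item>' for i in range(n_records)
    )
    return f"<root>{rows}</root>".encode("utf-8")


def best_of(fn, repeats: int = 20) -> float:
    """Return the fastest wall-clock time of `repeats` calls, in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


# 3000 records gives a document on the order of 120 KB.
data = make_xml(3000)

# Stand-in for strlen(): one pass over every byte looking for a NUL.
scan_time = best_of(lambda: data.count(b"\0"))

# Full parse into an element tree.
parse_time = best_of(lambda: ET.fromstring(data))

print(f"{len(data)} bytes: scan {scan_time * 1e6:.1f} us, "
      f"parse {parse_time * 1e6:.1f} us")
```

To test the original claim properly, the same two measurements would have to be taken in C++ against RapidXML and strlen() on the real file.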

Being a maintainer of the fastest XML library for Rust, I strongly disagree that XML is inherently fast to parse, and I question any such claim that comes with no evidence, especially when it has remained unchanged on their page since (at least) 2008 [0]. Have you actually tested that claim or are you taking it at face value?

IME the XML spec is so complex that you either end up with a slow but compliant parser or a fast one that doesn't implement the spec completely.

JSON, unlike XML, is minimal enough that writing an entire compliant parser with SIMD intrinsics [1] is actually practically feasible. That library claims 3 GB/s parsing speed, which could theoretically process your 120 KB of data in 1/25000th of a second (40 µs) instead of 2/1000ths of a second (2 ms).
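As a sanity check on those figures, here is the arithmetic as a minimal sketch. It assumes decimal units (120 KB = 120,000 bytes, 3 GB/s = 3e9 bytes/s) and takes both numbers at face value from the thread; note that the 2 ms figure also includes creating objects, not only parsing, so the comparison is rough.

```python
FILE_SIZE_BYTES = 120_000          # 120 KB file from the parent comment
CLAIMED_JSON_THROUGHPUT = 3e9      # 3 GB/s, as claimed by simdjson
REPORTED_XML_TIME_S = 2e-3         # 2 ms, as reported for the XML pipeline

# Time to process the file at the claimed JSON throughput.
json_time_s = FILE_SIZE_BYTES / CLAIMED_JSON_THROUGHPUT

# Throughput implied by the reported XML timing.
xml_throughput = FILE_SIZE_BYTES / REPORTED_XML_TIME_S

# Ratio between the two timings.
ratio = REPORTED_XML_TIME_S / json_time_s

print(f"JSON time: {json_time_s * 1e6:.0f} us")
print(f"Implied XML throughput: {xml_throughput / 1e6:.0f} MB/s")
print(f"Ratio: {ratio:.0f}x")
```

Under these assumptions the theoretical gap is about 50x, with the XML pipeline running at an implied 60 MB/s.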

I would wager that JSON is faster to parse, on balance.

[0] https://web.archive.org/web/20080209172554/https://rapidxml....

[1] https://github.com/simdjson/simdjson