Hacker News new | ask | show | jobs
by jjawssd 3327 days ago
> XML also has a lot of room for syntax errors, with many types of tokens and escape rules. JSON, by comparison, does not.

Parsing JSON is a minefield.

Yellow and light blue boxes highlight the worst situations for applications using the specified parser. Take a look at how a bunch of parsers perform with various payloads: http://seriot.ch/json/pruned_results.png

"JSON is the de facto standard when it comes to (un)serialising and exchanging data in web and mobile programming. But how well do you really know JSON? We'll read the specifications and write test cases together. We'll test common JSON libraries against our test cases. I'll show that JSON is not the easy, idealised format as many do believe. Indeed, I did not find two libraries that exhibit the very same behaviour. Moreover, I found that edge cases and maliciously crafted payloads can cause bugs, crashes and denial of services, mainly because JSON libraries rely on specifications that have evolved over time and that left many details loosely specified or not specified at all."

More details available at: http://seriot.ch/parsing_json.php

2 comments

None of these issues are as bad as the XML ones. You generally don't need "defusedjson" like you need https://pypi.python.org/pypi/defusedxml

<!DOCTYPE external [ <!ENTITY ee SYSTEM "file:///etc/ssh/ssh_host_ed25519_key"> ]> <root>&ee;</root>

Parser correctness is irrelevant when you're talking about the ability to be written with few syntax errors. For instance, JSON has one type of string with one set of string escape rules. XML has element names, attribute names, attribute values, text nodes, CDATA content, RCDATA content, and more. And almost all of them have different rules for what they can contain and how they can be used.

By comparison, XML is orders of magnitude more complex than JSON.