by jerf
1486 days ago
Unfortunately, the problem here is programmers more so than formats. It literally doesn't matter what you specify; programmers will not implement it to a T. Most programmers simply don't know that every single detail matters. Many of those who have some idea don't really care, since they can't imagine how something like this could happen.

It's not just XML. It's every ecosystem I've ever used. Push it around the edges and you will find things. This is neat, not because it is special to JSON in particular, but because it's an example of examining a good chunk of a large ecosystem: https://seriot.ch/projects/parsing_json.html

Consider that this is likely to be true in any ecosystem that doesn't make avoiding it a top priority.
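To make "push it around the edges" concrete, here is a small sketch using Python's standard `json` module. By default it accepts the constant `NaN`, which is not part of the JSON grammar, and silently keeps the last value when an object repeats a key. The helper names (`strict_loads`, `reject_constant`, `reject_duplicate_keys`) are made up for this illustration and are not taken from the linked survey.

```python
import json


def reject_constant(name):
    # json calls this hook for 'NaN', 'Infinity' and '-Infinity',
    # none of which are valid JSON values.
    raise ValueError(f"non-standard JSON constant: {name}")


def reject_duplicate_keys(pairs):
    # pairs is the list of (key, value) tuples for one object, in order.
    keys = [key for key, _ in pairs]
    if len(keys) != len(set(keys)):
        raise ValueError("duplicate object key")
    return dict(pairs)


def strict_loads(text):
    """Hypothetical stricter wrapper around json.loads."""
    return json.loads(
        text,
        parse_constant=reject_constant,
        object_pairs_hook=reject_duplicate_keys,
    )


# Default behaviour: both inputs parse without complaint.
lenient_dupes = json.loads('{"a": 1, "a": 2}')  # last value wins
lenient_nan = json.loads("NaN")                 # becomes float('nan')
```

Two parsers that disagree on inputs like these will each look fine in isolation; the trouble starts when one of them validates a message and the other one acts on it.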
For example, how many Protobuf parser libraries have security bugs? I'm guessing very few, because the standard is nice and simple, and it's very clearly defined without much "it's probably like this" wiggle room (much easier for binary formats!).
XML had a ton of unnecessary complexity that could have been avoided to make implementations simpler. I haven't actually read this bug yet, so let's see if it was one of:
* Closing tags having to repeat the name / two different ways of closing tags.
* CDATA
* Namespaces (especially how they are defined)
* &entities;
Edit: Ha, it wasn't any of those, but it was still an issue with text-based formats. It seems Expat assumes the content is valid UTF-8 (and doesn't validate it), while Gloox assumes it is ASCII. Obviously this couldn't have happened with binary formats.
If you care about security, DON'T USE TEXT FORMATS!
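A toy sketch of the kind of encoding mismatch described in the edit above: one scanner trusts UTF-8 lead bytes without validating them, the other walks the input byte by byte as if it were ASCII. This is not Expat or Gloox code and I haven't checked it against the actual bug; all function names and the payload are invented for illustration.

```python
def utf8_seq_len(lead: int) -> int:
    """Sequence length implied by a lead byte, trusting it blindly."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    return 1  # stray continuation or invalid byte: step over it


def find_trusting_utf8(data: bytes, target: int) -> int:
    """Look for target only at positions where a character should start."""
    i = 0
    while i < len(data):
        if data[i] == target:
            return i
        i += utf8_seq_len(data[i])
    return -1


def find_bytewise(data: bytes, target: int) -> int:
    """Look for target at every byte, as an ASCII-minded scanner would."""
    return data.find(bytes([target]))


def is_valid_utf8(data: bytes) -> bool:
    """Strict validation, the step neither scanner above performs."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# 0xF0 announces a 4-byte sequence, but the following bytes are plain ASCII.
payload = b'x\xf0"><y>'
quote = ord('"')
trusting_pos = find_trusting_utf8(payload, quote)
bytewise_pos = find_bytewise(payload, quote)
```

The byte-wise scanner finds the quote at index 2, while the lead-byte scanner skips over it and never sees it, so the two disagree about where the attribute ends. Rejecting the payload up front with something like `is_valid_utf8` removes the disagreement.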