by winstonewert
440 days ago
What actually prevents JSON from being used in these spaces? It seems to me that any XML structure can be represented in JSON. Personally, I've yet to come across an XML document I didn't wish was JSON, but perhaps in spaces I haven't worked with, it exists.
Well it can't: JSON has no processing instructions, no references, no comments, JSON "numbers" are problematic, and JSON arrays can't have attributes, so you're stuck with some kind of additional protocol that maps the two.
For something that is basically text (like an HTML document) or a list of dictionaries (like RSS) it may not be obvious what the value of these things is (or even what they mean, if you have little exposure to XML), so I'll try to explain some of that.
1. Processing instructions are like <?xml?> and <?xml-stylesheet?> -- these let your application embed linear processing instructions that you know are for the implementation, and so you know what your implementation needs to do with the information: If it doesn't need to do anything, you can ignore them easily, because they are (parsewise) distinct.
2. References (called entities) are declared with <!ENTITY x ...> and then used as &x; -- maybe you are familiar with &lt; representing <, but this is not mere string replacement: you can work with the pre-parsed entity object (for example, if it's an image), or treat it as a reference (which can make circular objects possible to represent in XML), neither of which is possible in JSON. Entities can also live behind an external URI.
3. Comments are for humans. Lots of people put special {"comment":"xxx"} objects in their JSON, so you need to understand that protocol and filter them out. In XML, comments are syntactically distinct (like processing instructions), so no such convention is needed.
4. JSON numbers fold into floats of different sizes in different implementations, so you have to avoid them in interchange protocols. This is annoying and bug-prone.
5. Attributes are the things on XML tags, as in <foo bar="42">...</foo>. Some people map this in JSON as {"bar":"42","children":[...],"tag":"foo"} and others as ["foo",{"bar":"42"},...], but you have to make a decision -- the former may be difficult to parse in a streaming way, while the latter creates additional nesting levels.
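To make points 1, 3 and 5 a bit more concrete, here is a rough Python sketch using only the standard library. The document, the helper names (top_level_kinds, to_object, to_array) and the exact shape of the two mappings are my own assumptions for illustration, not any standard:

```python
from xml.dom import Node, minidom

# Hypothetical document, made up for illustration.
DOC = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="style.xsl"?>
<!-- a note for humans -->
<foo bar="42"><item>a</item><item>b</item></foo>"""

KIND = {
    Node.PROCESSING_INSTRUCTION_NODE: "processing-instruction",
    Node.COMMENT_NODE: "comment",
    Node.ELEMENT_NODE: "element",
}


def top_level_kinds(xml_text):
    """Name the kind of each top-level node; the parser tells them apart for us."""
    dom = minidom.parseString(xml_text)
    return [KIND[node.nodeType] for node in dom.childNodes]


def to_object(node):
    """Mapping A: attributes share a dict with the reserved keys "tag" and "children".

    An attribute that is itself named "tag" or "children" would collide here.
    """
    if node.nodeType == Node.TEXT_NODE:
        return node.data
    out = dict(node.attributes.items())
    out["tag"] = node.tagName
    out["children"] = [to_object(child) for child in node.childNodes]
    return out


def to_array(node):
    """Mapping B: [tag, attributes, child, child, ...], one extra nesting level per element."""
    if node.nodeType == Node.TEXT_NODE:
        return node.data
    children = [to_array(child) for child in node.childNodes]
    return [node.tagName, dict(node.attributes.items())] + children


if __name__ == "__main__":
    root = minidom.parseString(DOC).documentElement
    print(top_level_kinds(DOC))
    print(to_object(root))
    print(to_array(root))
```

Both mappings carry the same information, but a consumer has to know which one the producer picked, and that is the "additional protocol" I mean.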
None of this is insurmountable: You can obviously encapsulate almost anything in almost anything else, but think about all the extra work you're doing, and how much risk there is in that code working forever!
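One concrete instance of that extra work, for point 4 (numbers): a small Python sketch. The payload and the names RAW, exact, folded and survives_double are made up for illustration. Python's json module happens to keep integers exact, so I use parse_int=float to imitate a parser that stores every number as a double:

```python
import json
from decimal import Decimal

# Hypothetical payload: the id is 2**53 + 1, which a double cannot represent exactly.
RAW = '{"id": 9007199254740993, "amount": 0.1}'

# Python's json keeps integers as arbitrary-precision ints.
exact = json.loads(RAW)["id"]

# Imitate an implementation that folds every number into a double.
folded = json.loads(RAW, parse_int=float)["id"]

# One workaround on the Python side: ask for Decimal instead of float.
amount = json.loads(RAW, parse_float=Decimal)["amount"]


def survives_double(n):
    """True if the integer n comes back unchanged after a round trip through a double."""
    return int(float(n)) == n


if __name__ == "__main__":
    print(exact, int(folded), amount)
    print(survives_double(exact))
```

The same bytes give two different ids depending on who parses them, which is why interchange formats end up putting numbers in strings and parsing them a second time.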
For me: I process financial/business data mostly in XML, so it is very important I am confident my implementation is correct, because shit happens as the result of that document getting to me. Having the vendor provide a spec any XML software can understand helps us have a machine-readable contract, but I am getting a number of new vendors who want to use JSON, and I will tell you their APIs never work: They will give me OpenAPI and Swagger "templates" that just don't validate, and type-coding always requires extra parsing of the strings the JSON parsing comes back with. If there's a pager interface: I have to implement special logic for that (this is built-in to XML). If they implement dates, sometimes it's unix-time, sometimes it's 1000x off from that, sometimes it's an ISO 8601-inspired string, and fuck sometimes I just get an HTTP date. And so on.
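The date situation ends up looking something like this sketch. The function name parse_vendor_date and the cutoff for telling seconds from milliseconds are my own guesses, not anything a vendor documented, and I am assuming "1000x off" means milliseconds:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Assumption: numeric values this large are milliseconds, not seconds.
MILLIS_CUTOFF = 1e11


def parse_vendor_date(value):
    """Best-effort guess at what a JSON "date" field means; returns an aware UTC datetime."""
    if isinstance(value, bool):
        raise TypeError("booleans are not dates")
    if isinstance(value, (int, float)):
        seconds = value / 1000 if abs(value) >= MILLIS_CUTOFF else value
        return datetime.fromtimestamp(seconds, tz=timezone.utc)
    if isinstance(value, str):
        try:
            # ISO 8601-ish; older Pythons do not accept a trailing "Z".
            parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
        except ValueError:
            # Fall back to the HTTP / RFC 5322 date format.
            parsed = parsedate_to_datetime(value)
        if parsed.tzinfo is None:
            # Assumption: naive timestamps are meant as UTC.
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed.astimezone(timezone.utc)
    raise TypeError(f"unsupported date value: {value!r}")


if __name__ == "__main__":
    for sample in (
        1704164645,
        1704164645000,
        "2024-01-02T03:04:05Z",
        "Tue, 02 Jan 2024 03:04:05 GMT",
    ):
        print(parse_vendor_date(sample).isoformat())
```

Every branch in there is a guess that can be wrong, and none of it is checked by the JSON parser itself.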
So I am always finding JSON that I wish were XML, because (in my use-cases) XML is just plain better than JSON. But if you do a lot in languages with poor XML support (like JavaScript, Python, etc) all of these things will seem hard enough that you might think json+xyz is a good alternative (especially if you like JSON), so I understand the need for stuff like "xee" to make XML more accessible so that people stop doing so much with JSON. I don't know Rust well enough to know if xee does that, but I fully understand the need.