Hacker News
by hot_gril 1168 days ago
Writing that JSON example by hand wasn't tedious. The XML example was, and the result is unreadable. It's important to be able to debug things easily. I'm going to manually type JSON when I'm testing an API, and I'm going to read the response.

If you absolutely don't care about human interface, no reason to use XML either. It's meant to be more verbose. The XML tags will often dominate the size of the payload with things like `<question>Who</question>`, so you have to start thinking about shorter names. Yes JSON has a similar problem, but at least it's halved and you don't have to instruct everyone to call each list element "e". If you super care about size, you'll use protobufs or something.


   <question>Who</question>
   "question":"Who",

To me, this does not seem like a win that's worth much, especially since it's likely to shrink considerably even with naive fast compression.
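A rough way to check the compression claim is to build both payloads and run them through zlib at its fastest level. This is only a sketch: the word list is made-up sample data, and such repetitive input is close to a best case for any compressor, so real payloads will shrink less.

```python
import json
import zlib
from xml.sax.saxutils import escape

# Hypothetical sample data: a few question words repeated many times.
words = ["Who", "What", "When", "Where", "Why"] * 200

# One element or one key-value pair per word, mirroring the two snippets above.
xml_payload = "".join(f"<question>{escape(w)}</question>" for w in words).encode()
json_payload = json.dumps(
    [{"question": w} for w in words], separators=(",", ":")
).encode()


def report(name: str, raw: bytes) -> tuple[int, int]:
    """Return (raw size, size after zlib at its fastest level)."""
    packed = zlib.compress(raw, 1)
    print(f"{name}: {len(raw)} bytes raw, {len(packed)} bytes compressed")
    return len(raw), len(packed)


xml_raw, xml_packed = report("XML", xml_payload)
json_raw, json_packed = report("JSON", json_payload)
```

On input like this the tag overhead is exactly the kind of repetition a compressor removes first, which is the point being made about the size difference.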

Furthermore, as messages grow in size, the explicitly named closing tag actually kind of starts helping.

Both of these syntaxes have their annoying quirks, for sure, and I understand you really dislike the closing tag; that clearly doesn't bother many people.

But regardless of personal preference, I'm really skeptical any of this really explains json's relentless path to replace (most) xml. Other reasons, such as the extreme wordiness some xml apis chose, the poor implementation of namespaces, the problems with embedding arbitrary data (in particular control characters), the inconsistency between attributes and elements, the lack of support for numbers, the lack of (conventional) support for key-value pairs - all of these surely played a much greater role than a fairly limited syntax issue.

And it's not even like json is without impractical quirks; lack of comments, the ban on trailing commas, and the need for quotes in object-keys spring to mind. Yet those don't mean json is likely to die out soon - even though javascript itself, from which it is derived, doesn't suffer from those (anymore)!
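For what it's worth, the three quirks are easy to demonstrate with a strict parser. A minimal sketch using Python's standard `json` module; the snippets are made-up inputs that a JavaScript engine would accept as literals.

```python
import json

# Each snippet is fine as a JavaScript literal but not as strict JSON.
snippets = {
    "comment": '{"a": 1 /* note */}',
    "trailing comma": "[1, 2, 3,]",
    "unquoted key": "{a: 1}",
}


def rejected_by_json(text: str) -> bool:
    """True if Python's strict JSON parser refuses the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return True
    return False


results = {name: rejected_by_json(text) for name, text in snippets.items()}
for name, rejected in results.items():
    print(f"{name}: {'rejected' if rejected else 'accepted'}")
```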

Not wrong, but also probably not really indicative of problems or actual use. And while I will be manually typing some data to go into an api for testing, I'm far more likely to be typing it in something that is way looser in what it accepts than a json document. Literally today just using dicts in python. And even then, my debugging is dominated by mistakes in data entry there.
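To make the "dicts in python" point concrete, here is a small sketch. The payload is a hypothetical request body; the dict literal tolerates comments, mixed quotes, and a trailing comma, and only becomes strict JSON when it is serialized.

```python
import json

# Hypothetical request body for an API under test, written as a Python dict.
payload = {
    'title': "Led Zeppelin II",  # single or double quotes both work
    "artist": "Led Zeppelin",
    "price": 999,  # a trailing comma is allowed here
}

# Serialize to strict JSON only at the boundary where it is sent.
body = json.dumps(payload)
print(body)
```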

Also, I see you took it to be a full on defense of XML. I did not really intend it that way. I think both can be fine. And insisting on either is likely a mistake.

I do find your nitpicks here amusing, still. Size of tag is just as obnoxious as size of key. And, though it can dominate the textual representation, there are clear ways to reduce that. Even knowing that BSON and Binary XML exist, though, I'd be hard pressed to name any project that failed because it wasn't using them.

JSON vs XML isn't going to make or break your project. But why would you use XML for data interchange? It makes sense for things like HTML where you're writing a document, but otherwise, it's usually just a needless burden.

Like, if I were there when XMPP was created, yes I would have insisted on JSON. XML was a plainly bad choice. Edit: Oh, JSON didn't exist until a little later. Maybe something similar did.

I mostly agree. I do think Jupyter chose wrong by picking JSON for their documents. They are literally marked up source documents.

XML does have the "benefit" of being a bit more extensible than JSON. Specifically, being able to have namespaced elements in there does make some sense on paper. For example, you could have two extensions both add in data using the same keys, but different namespaces. Can't really do that with JSON.
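A minimal sketch of that namespace scenario, using Python's standard `xml.etree.ElementTree`. The `cell` element, the `status` key, and both namespace URIs are hypothetical, picked only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs for two independent extensions.
doc = """
<cell xmlns:a="https://example.com/ext-a" xmlns:b="https://example.com/ext-b">
  <a:status>draft</a:status>
  <b:status>42</b:status>
</cell>
"""

root = ET.fromstring(doc.strip())
ns = {"a": "https://example.com/ext-a", "b": "https://example.com/ext-b"}

# Same local name, different namespace: the two values never collide.
status_a = root.find("a:status", ns).text
status_b = root.find("b:status", ns).text
print(status_a, status_b)
```

JSON has no built-in equivalent; prefixing keys is possible, but only as a convention every producer and consumer has to agree on.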

In practice, I think it just fell flat due to way too much "forethought" in things they anticipated people wanting.

Yes, XML is probably a good fit for something like Jupyter. Basically if you want to reuse a lot of "objects" throughout a structure and have them mean the same thing in different nested parts of it. Like how <a> in HTML means a hyperlink whether it's under <body> or some nested <div>.

I'd phrase it more that there is a document with mixed use items marked up throughout it. Some items in the document are code, in which case you probably want to fence the code with a marker for the language used. Other items are just prose, in which case you'd like to just write the prose as much as you can.

Some items can even be other forms of xml that have their own schemas dictating what is valid. (Thinking SVG here.)

I'll also note that even there, I can see why HTML went with the odd parsing they do. XHTML tried going with "well formed" documents, but that falls flat for the authors. That is why "sections" of a document are essentially just collecting all of the "h" tags and making an implied tree out of that, as opposed to making the tree directly. To that end, my markup language of choice for Jupyter style things is org-mode in emacs. Yes, it has some warts; but again, all formats that I have ever seen have warts.

Edit: I want to add that I don't intend this as a "correction." I should say that I agree with your post. Complicated field where I doubt I'd have done better than most others. :)
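The "implied tree" from heading tags can be sketched in a few lines. This is an illustration of the idea only, not the outline algorithm from any HTML specification; `Section` and `build_outline` are hypothetical names, and the input is assumed to be a flat list of (level, title) pairs already pulled out of the document.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    level: int
    title: str
    children: list["Section"] = field(default_factory=list)


def build_outline(headings: list[tuple[int, str]]) -> Section:
    """Nest a flat list of (level, title) headings under a synthetic root."""
    root = Section(0, "document")
    stack = [root]  # path from the root to the most recent heading
    for level, title in headings:
        # Pop until the top of the stack is a shallower heading: the parent.
        while stack[-1].level >= level:
            stack.pop()
        node = Section(level, title)
        stack[-1].children.append(node)
        stack.append(node)
    return root


outline = build_outline([(1, "Intro"), (2, "Why"), (2, "How"), (1, "Results")])
print([child.title for child in outline.children])
```

The author only ever writes flat headings; the nesting is recovered afterwards, which is the trade-off being described.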

Yeah, now express this in JSON:

   <div>
     <p>JSON example:</p>
     <pre>
      [
        {"title": "Led Zeppelin II", "artist": "Led Zeppelin", "price": 999},
        {"title": "La Brise", "artist": "Arax", "price": 999}
      ]
     </pre>
     <p>Source: <a href="https://news.ycombinator.com/item?id=35472014">click here</a>!</p>
   </div>

JSON is great for certain domains, but there are other domains where it is a nightmare and XML shines.

Use the right tool for the job.
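To see what "express this in JSON" costs, here is a sketch that converts just the last paragraph of that example. The `{tag, attrs, children}` encoding and the `to_json_tree` helper are one hypothetical choice among many, since JSON itself has no convention for text mixed with elements.

```python
import json
import xml.etree.ElementTree as ET

# The last paragraph of the example above: text, a link, then more text.
snippet = (
    '<p>Source: <a href="https://news.ycombinator.com/item?id=35472014">'
    "click here</a>!</p>"
)


def to_json_tree(elem: ET.Element) -> dict:
    """Encode an element as {tag, attrs, children}, with text runs as strings."""
    children = []
    if elem.text:
        children.append(elem.text)
    for child in elem:
        children.append(to_json_tree(child))
        if child.tail:  # text that follows the child inside the parent
            children.append(child.tail)
    return {"tag": elem.tag, "attrs": dict(elem.attrib), "children": children}


tree = to_json_tree(ET.fromstring(snippet))
print(json.dumps(tree, indent=2))
```

One short line of markup turns into a nested structure with explicit text nodes, which is roughly the nightmare being referred to.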