by hot_gril 1168 days ago
Every key is written twice, for opening and closing. Keys can be duplicated, and in fact that's what you have to do if you want a simple list. There aren't numeric types, so you have to parse strings. It also looks horrible.

  <cds>
    <cd><title>Led Zeppelin II</title><artist>Led Zeppelin</artist><price>999</price></cd>
    <cd><title>La Brise</title><artist>Arax</artist><price>999</price></cd>
  </cds>
or

  <cds>
    <cd>
      <title>Led Zeppelin II</title>
      <artist>Led Zeppelin</artist>
      <price>999</price>
    </cd>
    <cd>
      <title>La Brise</title>
      <artist>Arax</artist>
      <price>999</price>
    </cd>
  </cds>
vs something like

  [
    {"title": "Led Zeppelin II", "artist": "Led Zeppelin", "price": 999},
    {"title": "La Brise", "artist": "Arax", "price": 999}
  ]
You can probably do better using XML attributes. But then you're using more features.
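To make the "no numeric types" point concrete, here is a rough Python sketch using only the standard library. The variable names are illustrative and not from the comment above. It parses both versions of the CD list and shows that the XML price comes back as a string the caller has to convert, while the JSON price is already an integer.

```python
import json
import xml.etree.ElementTree as ET

xml_doc = """
<cds>
  <cd><title>Led Zeppelin II</title><artist>Led Zeppelin</artist><price>999</price></cd>
  <cd><title>La Brise</title><artist>Arax</artist><price>999</price></cd>
</cds>
""".strip()

json_doc = """
[
  {"title": "Led Zeppelin II", "artist": "Led Zeppelin", "price": 999},
  {"title": "La Brise", "artist": "Arax", "price": 999}
]
"""

root = ET.fromstring(xml_doc)

# Every XML leaf is text; the parser has no idea <price> is meant to be a number.
raw_price = root.find("cd/price").text

# The caller has to know the schema and convert by hand.
xml_cds = [
    {
        "title": cd.findtext("title"),
        "artist": cd.findtext("artist"),
        "price": int(cd.findtext("price")),
    }
    for cd in root.findall("cd")
]

# JSON carries the number type in the syntax itself.
json_cds = json.loads(json_doc)

print(type(raw_price).__name__, type(json_cds[0]["price"]).__name__)
```

An XML schema can declare the type, but that is a separate layer on top of the document.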

If we are complaining about the closing tags, might as well add that embedding newlines or quotes into JSON is less than pleasant.
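As a small illustration of the escaping complaint, the sketch below (Python standard library; the sample text is made up) serializes a string containing quotes and a newline both ways. JSON has to escape both, while XML element text can hold them as-is.

```python
import json
import xml.etree.ElementTree as ET

text = 'She said "hi"\nand left.'

# JSON strings cannot contain a raw newline or an unescaped double quote.
json_encoded = json.dumps({"note": text})

# XML element text only needs &, < and > escaped.
note = ET.Element("note")
note.text = text
xml_encoded = ET.tostring(note, encoding="unicode")

print(json_encoded)
print(xml_encoded)
```

XML attribute values are a different story, since quotes and newlines do need care there.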

Which is to say, this feels like a bit of a non-issue. Yes, writing it by hand can get tedious, but that is true of any and every format. That is why you will almost certainly reach for other formats when entering a long list of data. And each and every one of them will fail on some form of input in frustrating ways.

Writing that JSON example by hand wasn't tedious. The XML example was, and the result is unreadable. It's important to be able to debug things easily. I'm going to manually type JSON when I'm testing an API, and I'm going to read the response.

If you absolutely don't care about human interface, no reason to use XML either. It's meant to be more verbose. The XML tags will often dominate the size of the payload with things like `<question>Who</question>`, so you have to start thinking about shorter names. Yes JSON has a similar problem, but at least it's halved and you don't have to instruct everyone to call each list element "e". If you super care about size, you'll use protobufs or something.

<question>Who</question> "question":"Who",

To me, this does not seem like a win that's worth much, especially since it's likely to shrink considerably even with naive fast compression.
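One way to check the compression claim is a quick experiment like the sketch below. It is only a rough test under assumptions of my own: 200 synthetic records, compact JSON, and zlib at its fastest level standing in for "naive fast compression". Real payloads will behave differently.

```python
import json
import zlib

# Synthetic records; the field names mirror the CD example from the thread.
records = [
    {"title": f"Title {i}", "artist": f"Artist {i}", "price": 999}
    for i in range(200)
]

xml_doc = "<cds>" + "".join(
    f"<cd><title>{r['title']}</title>"
    f"<artist>{r['artist']}</artist>"
    f"<price>{r['price']}</price></cd>"
    for r in records
) + "</cds>"
json_doc = json.dumps(records, separators=(",", ":"))

# Level 1 is zlib's fastest setting.
xml_compressed = len(zlib.compress(xml_doc.encode("utf-8"), 1))
json_compressed = len(zlib.compress(json_doc.encode("utf-8"), 1))

raw_gap = len(xml_doc) - len(json_doc)
compressed_gap = xml_compressed - json_compressed

print("raw bytes:       ", len(xml_doc), len(json_doc), "gap", raw_gap)
print("compressed bytes:", xml_compressed, json_compressed, "gap", compressed_gap)
```

Repeated tag names are exactly what a dictionary-based compressor handles well, so the byte gap between the two formats narrows once both are compressed.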

Furthermore, as messages grow in size, the explicitly named closing tag actually kind of starts helping.

Both of these syntaxes have their annoying quirks, for sure, and I understand you really dislike the closing tag; that clearly doesn't bother many people.

But regardless of personal preference, I'm really skeptical any of this really explains json's relentless path to replace (most) xml. Other reasons, such as the extreme wordiness some xml apis chose, the poor implementation of namespaces, the problems with embedding arbitrary data (in particular control characters), the inconsistency between attributes and elements, the lack of support for numbers, the lack of (conventional) support for key-value pairs - all of these surely played a much greater role than a fairly limited syntax issue.

And it's not even like json is without impractical quirks; lack of comments, the ban on trailing commas, and the need for quotes in object-keys spring to mind. Yet those don't mean json is likely to die out soon - even though even javascript itself from which it is derived doesn't suffer from those (anymore)!
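The quirks listed above are easy to reproduce. This sketch uses Python's standard `json` module, which follows the strict grammar; the helper name `is_valid_json` and the sample strings are mine.

```python
import json

def is_valid_json(text):
    """Return True if the strict JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

samples = {
    "plain": '{"a": 1}',
    "trailing comma": '{"a": 1,}',
    "unquoted key": '{a: 1}',
    "comment": '{"a": 1} // note',
}

results = {name: is_valid_json(text) for name, text in samples.items()}
for name, ok in results.items():
    print(f"{name:15} accepted={ok}")
```

All three rejected forms are legal in a JavaScript object literal, which is the irony the comment points at.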

Not wrong, but also probably not really indicative of problems or actual use. And while I will be manually typing some data to go into an API for testing, I'm far more likely to be typing it in something that is way looser in what it accepts than a JSON document. Literally today I was just using dicts in Python. And even then, my debugging is dominated by mistakes in data entry there.

Also, I see you took it to be a full on defense of XML. I did not really intend it that way. I think both can be fine. And insisting on either is likely a mistake.

I do find your nitpicks here amusing, still. Size of tag is just as obnoxious as size of key. And, though it can dominate the textual representation, there are clear ways to reduce that. Even knowing that BSON and Binary XML exist, though, I'd be hard pressed to name any project that failed because it wasn't using them.

JSON vs XML isn't going to make or break your project. But why would you use XML for data interchange? It makes sense for things like HTML where you're writing a document, but otherwise, it's usually just a needless burden.

Like, if I were there when XMPP was created, yes I would have insisted on JSON. XML was a plainly bad choice. Edit: Oh, JSON didn't exist until a little later. Maybe something similar did.

I mostly agree. I do think Jupyter chose wrong by picking JSON for their documents. They are literally marked up source documents.

XML does have the "benefit" of being a bit more extensible than JSON. Specifically, being able to have namespaced elements in there does make some sense on paper. For example, you could have two extensions both add in data using the same keys, but different namespace. Can't really do that with JSON.
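A minimal sketch of the namespace idea, assuming two hypothetical extensions ("lint" and "timing") that both want a `status` key on the same element. The namespace URIs and element names are invented for illustration and do not come from any real format.

```python
import json
import xml.etree.ElementTree as ET

doc = """
<cell xmlns:lint="urn:example:lint" xmlns:timing="urn:example:timing">
  <source>print("hi")</source>
  <lint:status>ok</lint:status>
  <timing:status>12ms</timing:status>
</cell>
""".strip()

cell = ET.fromstring(doc)
ns = {"lint": "urn:example:lint", "timing": "urn:example:timing"}

# Same local name, different namespace: both values survive side by side.
lint_status = cell.find("lint:status", ns).text
timing_status = cell.find("timing:status", ns).text

# In JSON the two extensions collide on the key; Python's parser keeps the last one.
collided = json.loads('{"status": "ok", "status": "12ms"}')

print(lint_status, timing_status, collided)
```

With JSON you would fall back on a naming convention such as prefixed keys, which works but is not part of the format.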

In practice, I think it just fell flat due to way too much "forethought" in things they anticipated people wanting.

Yes, XML is probably a good fit for something like Jupyter. Basically if you want to reuse a lot of "objects" throughout a structure and have them mean the same thing in different nested parts of it. Like how <a> in HTML means a hyperlink whether it's under <body> or some nested <div>.
Yeah, now express this in JSON:

   <div>
     <p>JSON example:</p>
     <pre>
      [
        {"title": "Led Zeppelin II", "artist": "Led Zeppelin", "price": 999},
        {"title": "La Brise", "artist": "Arax", "price": 999}
      ]
     </pre>
     <p>Source: <a href="https://news.ycombinator.com/item?id=35472014">click here</a>!</p>
   </div>
JSON is great for certain domains, but there are other domains where it is a nightmare and XML shines.

Use the right tool for the job.
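To see why mixed content is awkward in JSON, here is one possible encoding, sketched in Python. The `[tag, attributes, children...]` layout and the function name `to_jsonable` are assumptions of mine; JSON has no standard way to interleave text and elements.

```python
import json
import xml.etree.ElementTree as ET

def to_jsonable(elem):
    """Encode an element as [tag, attributes, child, child, ...],
    keeping text and child elements in document order."""
    node = [elem.tag, dict(elem.attrib)]
    if elem.text:
        node.append(elem.text)
    for child in elem:
        node.append(to_jsonable(child))
        if child.tail:
            # Text that follows a child element, such as the "!" after </a>.
            node.append(child.tail)
    return node

fragment = (
    '<p>Source: <a href="https://news.ycombinator.com/item?id=35472014">'
    "click here</a>!</p>"
)
converted = to_jsonable(ET.fromstring(fragment))
print(json.dumps(converted, indent=2))
```

It round-trips the structure, but nobody would want to write the result by hand, which is the point being made about documents versus data.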

you can't ignore ux stuff like this in a protocol that's meant for general use

something like duplicating info in closing tags in XML (which applies to every element) isn't really comparable to stuff like having to escape certain characters in JSON strings (which applies only to the values that use those things)

perfect is the enemy of the good, and the good is the metric

Don't you also have to escape stuff in XML? Like &gt;, which is even worse.
Yes, though many languages have lenient parsers. Most browser parsers, for example, will probably only be lenient if parsing "HTML."

    new XMLSerializer().serializeToString(new DOMParser().parseFromString("<a>hello < </a>", "text/html")) 
The above in my console does as expected there. And again, entities are a very dangerous part of XML and friends.

You are correct that if you tell it that that is xml, the browser will throw it back at you. Just as the JSON parser will barf on JSON.parse("{'test':'value'}").
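The same strict-versus-lenient split can be reproduced outside the browser. The sketch below uses Python's standard library as a stand-in, which is an assumption on my part: `html.parser` is only a rough approximation of a browser's HTML parser. The class name `Collector` is illustrative.

```python
import json
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

snippet = "<a>hello < </a>"

class Collector(HTMLParser):
    """Record start tags and text so we can see what the lenient parser kept."""
    def __init__(self):
        super().__init__()
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        self.text.append(data)

# Lenient: the stray "<" is treated as text.
lenient = Collector()
lenient.feed(snippet)
lenient.close()

# Strict XML: a bare "<" in text is not well-formed.
try:
    ET.fromstring(snippet)
    xml_ok = True
except ET.ParseError:
    xml_ok = False

# Strict JSON: single-quoted strings are rejected.
try:
    json.loads("{'test':'value'}")
    json_ok = True
except json.JSONDecodeError:
    json_ok = False

# A serializer does the escaping for you.
elem = ET.Element("a")
elem.text = "hello < "
escaped = ET.tostring(elem, encoding="unicode")

print(lenient.tags, "".join(lenient.text), xml_ok, json_ok, escaped)
```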

per specifications, json parsing is not lenient, html parsing is lenient
Right, and amusingly, more than a few json parsers are very lenient in this. That or folks abandon ship fairly quickly and go for another spec that is far more friendly.
To be pedantic, html parsing is not lenient, it is unambiguously specified.
Thanks. I get your point about the close element including the tag name - but that's the kind of detail I leave to the serialisation library, in the same way that the close scope token in json is different to the start scope token.

As for "looks horrible"... well yeah, I always feel that xml looks "spikey" somehow. But I've been programming in curly-brace languages for 30+ years and I still find json harder to read than xml: I think my brain tries to interpret it as code, not data. I find xml easier to read (even when it's unformatted) precisely because the close-tokens kind of document what element they're closing.

Each to their own I guess. At least we're not stuck using ASN1.

> At least we're not stuck using ASN1.

Prepare for trouble, and make it double: http://xml.coverpages.org/dstc-xer2.html

And if someone is nice enough to stuff a NUL in the document, it all shatters.
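Reading the NUL remark as being about XML (my assumption, since the comment does not say), the failure is easy to demonstrate with Python's standard library. XML 1.0 has no way to represent U+0000, not even as a character reference, while JSON can escape it. The helper name `xml_parses` is mine.

```python
import json
import xml.etree.ElementTree as ET

payload = "before\x00after"

# JSON escapes the NUL and round-trips it.
json_text = json.dumps(payload)
json_round_trip = json.loads(json_text)

def xml_parses(data):
    """Return True if the XML parser accepts the document."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False

# ElementTree's serializer does not validate characters, so it happily
# writes a document that no conforming parser will read back.
elem = ET.Element("a")
elem.text = payload
xml_bytes = ET.tostring(elem)
raw_nul_ok = xml_parses(xml_bytes)

# The character reference form is rejected as well.
char_ref_ok = xml_parses("<a>&#0;</a>")

print(json_text, raw_nul_ok, char_ref_ok)
```

The usual workaround is to base64-encode such values, which costs size and readability.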