by chii 1007 days ago
> not having first class syntax for representing them

i think you're mixing representation of xml data vs the representation of them in a programming language.

XML does have arrays. They are called child elements.


No, you've misunderstood my point. This doesn't work for cases where one child is in fact a property that is a complex object.

XML claims to solve the problem of attributes vs children but then falls short at the first hurdle by not discerning between a single complex object as an attribute and an array of complex objects as children.

JSON and YAML do not have this problem as they are explicit in their representation.

YAML example:

    parent:
        child: name
vs

    parent:
      - child: name
Try converting each of these to JSON. The former will give you an object property called child, the latter will give you an array property called child with one element
I'm not sure about that - I think your second example will parse to "parent": [{"child": "name"}] in JSON
Yeah this was what I was going for, same point stands
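To make the distinction concrete, here is a minimal Python sketch using only the standard library. It shows the two JSON shapes the YAML examples above correspond to, and that an XML element with a single child looks the same whether that child was meant as a property or as a one-element list. The helper name `children_of` is made up for illustration.

```python
import json
import xml.etree.ElementTree as ET

# The two JSON shapes the YAML examples above correspond to.
as_object = json.loads('{"parent": {"child": "name"}}')
as_array = json.loads('{"parent": [{"child": "name"}]}')

print(type(as_object["parent"]).__name__)  # dict
print(type(as_array["parent"]).__name__)   # list

# In XML, both intents serialize to the same markup, so the parser
# alone cannot tell a single property from a one-element list.
doc = ET.fromstring("<parent><child>name</child></parent>")


def children_of(element):
    """Hypothetical helper: return (tag, text) pairs for direct children."""
    return [(child.tag, child.text) for child in element]


print(children_of(doc))  # [('child', 'name')]
```

Telling the two cases apart in XML needs something outside the document itself, such as a schema or a convention agreed between producer and consumer.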
Aren't XML child elements pretty much the most verbose way you can represent an array though?
I think the verbosity is not a problem. For example if you compare

    ["string1", "string2"]
to

    <list>
        <e>string1</e>
        <e>string2</e>
    </list>
then each element has about four bytes overhead (<e> instead of " and </e> instead of ",) plus some overhead for the list itself that may be offset by putting the name of the list itself into the element.

However, the issue is that you have to write a custom parser. There is no direct mapping between your data structure and the XML file. That direct mapping is a big ergonomic win for JSON, and consequently for YAML.
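The per-element overhead estimate above can be checked with a few lines of Python. This is only a sketch: it assumes the short `<e>` tag from the example, no whitespace between elements, and counts the trailing comma as part of each JSON item (the last item has none, so it comes out one byte worse). The function names are made up for illustration.

```python
def json_item(value: str) -> str:
    # One list item in JSON: opening quote, value, closing quote, comma.
    return f'"{value}",'


def xml_item(value: str) -> str:
    # One list item in XML, using the short <e> tag from the example.
    return f"<e>{value}</e>"


for value in ["string1", "string2"]:
    json_size = len(json_item(value).encode("utf-8"))
    xml_size = len(xml_item(value).encode("utf-8"))
    print(value, xml_size - json_size)  # prints 4 for each value
```

Longer, more descriptive tag names raise the per-element cost accordingly.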

> There is no direct mapping between your data structure and the XML file.

i think that's by design tbh.

it's only a big win for JSON (and YAML) because the default case works OK - but every time someone has a problem parsing numbers in JSON (because the value is bigger than Integer.MAX in the host language), this is the cause.
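A rough Python illustration of the number problem, as a sketch rather than a description of any particular parser: Python's own `int` has no upper bound, so the host limit is simulated here by forcing every integer through a 64-bit float, which is how JavaScript's `Number` stores values. The payload and the `id` field are invented for the example.

```python
import json

raw = '{"id": 9007199254740993}'  # 2**53 + 1

# Python's int is arbitrary precision, so the default parse is exact.
exact = json.loads(raw)["id"]

# Simulate a host that stores every number as an IEEE 754 double:
# the value is silently rounded to the nearest representable double.
as_double = json.loads(raw, parse_int=float)["id"]

print(exact)                    # 9007199254740993
print(int(as_double))           # 9007199254740992
print(exact == int(as_double))  # False
```

The JSON text is the same in both cases; only the implicit mapping onto a host type differs.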

Yes, I understand that (and I like XML as a format and XSLT 2.0 as a language). However, from the popularity of JSON, it seems that for most cases it's the easier choice.

Take any random REST API for example. If it returns JSON, you can integrate it more easily than if it returned XML. If you need special cases like large numbers (or date-times), you handle only those.

I'm confused? Integrating XML was fairly easy back in the day. If in a dynamic language, parse into a DOM and then use XPath to get data out. If in a static language, parse into your objects.

With JSON, you can mostly do the same. So I don't necessarily see this as a huge advantage of XML, mind. Having a schema does have some advantages, though.
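For comparison, a small Python sketch of both approaches, using only the standard library. `xml.etree.ElementTree` supports a limited subset of XPath rather than the full language, and the `users`/`user`/`name` payloads are invented for the example.

```python
import json
import xml.etree.ElementTree as ET

xml_payload = """
<users>
    <user><name>alice</name></user>
    <user><name>bob</name></user>
</users>
""".strip()

json_payload = '{"users": [{"name": "alice"}, {"name": "bob"}]}'

# XML: parse into a tree, then pull values out with a path expression.
root = ET.fromstring(xml_payload)
xml_names = [node.text for node in root.findall("./user/name")]

# JSON: parse into dicts and lists, then index into them directly.
json_names = [user["name"] for user in json.loads(json_payload)["users"]]

print(xml_names)   # ['alice', 'bob']
print(json_names)  # ['alice', 'bob']
```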

JSON maps directly only to JavaScript, and only because it was designed as a subset of JavaScript. For other languages you have to use a DOM or serializers, and then there's no difference between the formats. For that matter, XML has generic serializers that can be used instead of writing a custom one every time.
If you interpret the start and end tags of the child elements as syntax indicating the type of each value, then those tags are analogous to, say, the quotes that enclose a string literal. In other words, in

    <foo>hello</foo>
    <foo>world</foo>
the <foo> and </foo> serve the same purpose as the double quotes in

    "hello",
    "world"
with the added benefit that the type system can be much richer (i.e. not everything is just a nondescript string value).

And you don’t even need a comma to separate the values! ;)
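One way to read "tags as type indicators", sketched in Python under the assumption that the element names are themselves type names. The `CONVERTERS` table, the `read_values` function, and the tag names `str`, `int`, and `float` are all hypothetical; a real vocabulary would define its own.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from tag name to the Python type it denotes.
CONVERTERS = {"str": str, "int": int, "float": float}


def read_values(xml_text):
    """Convert each child element's text using the type named by its tag."""
    root = ET.fromstring(xml_text)
    return [CONVERTERS[child.tag](child.text) for child in root]


values = read_values("<list><str>hello</str><int>42</int><float>1.5</float></list>")
print(values)  # ['hello', 42, 1.5]
```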