by lisper 4048 days ago
> I really do not see anything useful in having attributes

That's because you chose examples that show attributes at their worst.

Suppose that instead of a document with a single author you had a document to which many people contributed, and you wanted to mark it up to show who wrote which section. Using attributes (and identifiers) you would have, e.g.:

    <author id="Bob">[info about Bob]</author>
    <author id="Alice">[info about Alice]</author>
    <span author="Bob">This part was written by Bob</span>
    <span author="Alice">and this part was written by Alice</span>
    <span author="Bob">and this part was written by Bob again.</span>
This example also highlights why it is sometimes NECESSARY to use identifiers in order to produce the semantically correct structure. Suppose you put all the author information in-line as you suggest. The result would look something like this:

    <span><author><name>Bob</name>[info about Bob]</author>This part was written by Bob</span>
    <span><author><name>Alice</name>[info about Alice]</author>This part was written by Alice</span>
    <span><author><name>Bob</name>[info about Bob]</author>This part was written by Bob</span>
Were the first and third parts written by the same person, or by two different people whose names both happen to be Bob? If you put everything in-line there is no way to express that two pieces of structure are intended to be EQ to each other.
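
To make the EQ point concrete, here is a minimal sketch in Python (chosen only because it is easy to run, not because either poster used it). It parses both versions of the markup and checks whether the first and third spans end up pointing at the same author object. The `<doc>` wrapper, the quoted attribute values, and the helper names are assumptions added so that a strict XML parser accepts the input.

```python
import xml.etree.ElementTree as ET

# Same markup as the identifier-based example, wrapped in a hypothetical
# <doc> root and with attribute values quoted so it is well-formed XML.
BY_REFERENCE = """
<doc>
  <author id="Bob">[info about Bob]</author>
  <author id="Alice">[info about Alice]</author>
  <span author="Bob">This part was written by Bob</span>
  <span author="Alice">and this part was written by Alice</span>
  <span author="Bob">and this part was written by Bob again.</span>
</doc>
"""

# Same markup as the inline example, again wrapped in a hypothetical root.
INLINE = """
<doc>
  <span><author><name>Bob</name>[info about Bob]</author>This part was written by Bob</span>
  <span><author><name>Alice</name>[info about Alice]</author>This part was written by Alice</span>
  <span><author><name>Bob</name>[info about Bob]</author>This part was written by Bob</span>
</doc>
"""

def authors_by_reference(xml_text):
    """Resolve each span's author attribute to the <author> element with that id."""
    root = ET.fromstring(xml_text)
    by_id = {a.get("id"): a for a in root.findall("author")}
    return [by_id[s.get("author")] for s in root.findall("span")]

def authors_inline(xml_text):
    """Return the <author> element nested inside each span."""
    root = ET.fromstring(xml_text)
    return [s.find("author") for s in root.findall("span")]

ref_authors = authors_by_reference(BY_REFERENCE)
inline_authors = authors_inline(INLINE)

# With identifiers, spans 1 and 3 resolve to the very same object.
first_and_third_shared = ref_authors[0] is ref_authors[2]

# Inline, spans 1 and 3 hold two distinct objects that merely carry the same name.
first_and_third_distinct = inline_authors[0] is not inline_authors[2]
same_name = (inline_authors[0].find("name").text
             == inline_authors[2].find("name").text)
```

In the inline version the parser has no basis for deciding whether the two Bobs are one person; the best a consumer can do is compare names and guess.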
1 comments

Did you not read my reply, really?

I already mentioned that with a Lisp-like data format, shared sub-expressions could be denoted using CL's reader variables:

      (document
        #1=(author (id "Bob") ... )
        #2=(author (id "Alice") ... )
        (span (author #1#) "written by Bob")
        (span (author #2#) "written by Alice")
        (span (author #1#) "written by Bob"))
I do not claim that this is the most appropriate solution in all cases, just that we are not forced to introduce indirection levels when unnecessary. Now, if I am using Lisp and I want to introduce external references to authors described in other documents, I could introduce metadata with an appropriate semantic structure:

       (external-element (pathname (directory (relative "path" "to")) 
                                   (type "lisp")
                                   (name "file")) 
                         (tree-path 2 1 3 2 2 3))
This would be a practical way to encode a precise location in a tree in an external file. And I could use this form everywhere I need to reference an object. Also, the tree-path notation is handy because there is no distinction between an attribute and an element, just which branch to take at each step from the root.
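
As a rough illustration of how such a tree-path could be resolved, here is a small Python sketch over nested lists standing in for S-expressions. The function name, the sample document, and the choice of zero-based indices (with index 0 being the tag itself) are all assumptions; the comment above does not say how the indices in `(tree-path 2 1 3 2 2 3)` are counted.

```python
def follow_tree_path(tree, path):
    """Descend into nested lists, taking the branch given by each index in turn."""
    node = tree
    for index in path:
        node = node[index]
    return node

# A hypothetical document as nested lists, in the shape (tag child child ...).
document = [
    "document",
    ["author", ["id", "Bob"]],
    ["author", ["id", "Alice"]],
    ["span", ["author-ref", "Bob"], "written by Bob"],
]

# Third item of the root, then its (id ...) child, then the value.
alice_id = follow_tree_path(document, [2, 1, 1])

# The same function reaches plain content just as well.
span_text = follow_tree_path(document, [3, 2])
```

The lookup never asks whether a branch is an "attribute" or an "element"; both are just positions in the list, which is the property claimed for the notation.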

Now, with XML attributes, I would typically have an "xref" attribute. How can we model xref attributes? If we wanted structured data, we would need to create external tags for the same concepts as above, like <pathname>, give each xref a local identifier, and refer to each xref indirectly through that identifier, because attributes can only hold strings. I mean:

     <author xref="xref02"> 
     ...
     <xref id="xref02">
       <pathname> ... </pathname>
       <tree-path> ... </tree-path>
     </xref>
Or we do as everybody does and encode it in a complex string, as XMI, ECORE, and other custom formats do, hoping that HTML entities are properly escaped.

Besides, you failed to notice that you had <author> tags, which goes precisely against your idea that there should be a place for "meta-data" and a place for "data": the authors are now part of the content of the document, not merely meta-information.

If you think my examples are artificial, open the source code of this page, and observe how any kind of complex information written in attributes has to be properly escaped to bypass the limitation of stringly-typed data:

       reply?id=9556252&amp;goto=item%3Fid%3D9555880"

       href="vote?for=9556252&amp;dir=up&amp;auth=0UU000REDACTED000208d8b9f4a45575b4edea3779&amp;goto=item%3Fid%3D9555880"
Notice how you need to escape HTML entities in inline JavaScript attributes (onclick) but not inside script tags. Why is inline JavaScript not written as tags instead?

(see http://stackoverflow.com/questions/8749001/escaping-html-ent...).
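
For what it is worth, the two layers of escaping in the reply link above can be reproduced in a few lines of Python. This is only a sketch: that the link was built from an `id` parameter and a nested `goto` URL is a guess from its shape, not something known about how the site generates it.

```python
import html
from urllib.parse import urlencode, parse_qs

# Structured data we want to carry: a target plus a nested "return to" URL.
# The parameter names are read off the link above; the split is an assumption.
params = {"id": "9556252", "goto": "item?id=9555880"}

# Layer 1: percent-encode the nested URL so it survives as a query value.
url = "reply?" + urlencode(params)

# Layer 2: escape for HTML so the URL survives inside an attribute value.
attribute_value = html.escape(url, quote=True)

# Reading it back means undoing both layers in reverse order.
decoded_url = html.unescape(attribute_value)
decoded_params = parse_qs(decoded_url.split("?", 1)[1])
```

Here `attribute_value` comes out as `reply?id=9556252&amp;goto=item%3Fid%3D9555880`, the same string as in the page source: a two-field record flattened into a string and escaped twice.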

Whatever example you choose, you cannot deny the fact that attributes are not given the same rights as elements, because they cannot contain structured data and cannot have meta-attributes of their own.

> Did you not read my reply, really?

I did read it.

> I already mentioned that with a Lisp-like data format, shared sub-expressions could be denoted using CL's reader variables:

Yes, of course this is possible. But that's just a different way of implementing tags (and not even a very good one, because your tags are constrained to be numeric).

> we are not forced to introduce indirection levels when unnecessary

That's a tautology.

> I could use this form everywhere I need to reference an object.

Of course you could. Most problems have more than one reasonable solution. But pointing out one reasonable solution is irrelevant to the question of whether a different solution is also reasonable.

> your idea that there should be a place for "meta-data" and a place for "data"

That wasn't exactly my idea. What I said was that there was value in having a syntactic distinction between data and meta-data. But I didn't say that this distinction should be universal. In fact it is impossible to distinguish between data and metadata in general, so you can always come up with examples where a particular datum's role is ambiguous. That doesn't change the fact that in many practical circumstances, having a syntactic distinction is appropriate and useful.

> observe how any kind of complex information written in attributes has to be properly escaped

Again, citing circumstances where things fall apart does not change the fact that in many practical circumstances, having a syntactic distinction between data and meta-data is appropriate and useful.

If you choose to reply to this, please remember: I'm a Lisp fan. (Look at my HN user ID!) I hate XML. I much prefer S-expressions. When I have to deal with XML, the first thing I do is parse it into S-expressions. The world would be a better place if everything were S-exprs and no one used SGML or any of its devil-spawn syntaxes. But that's not the world we live in. In the world we live in, where markup languages exist and are required to have matching end tags, attributes are a defensible design.