by BrandonM 6512 days ago
But is it really easier to read and write? Properly-indented S-expressions are just as readable. Generating XML and then gzip-ing it is a lot more work (and requires a lot more libraries) than generating S-expressions.

Case in point:

  <table>
    <tr>
      <td>a</td>
      <td>b</td>
    </tr>
    <tr>
      <td>c</td>
      <td>d</td>
    </tr>
  </table>
vs:

  (table
    (tr
      (td a)
      (td b))
    (tr
      (td c)
      (td d)))
Perhaps the real problem is that too many people use terrible text editors. Paren-matching and auto-indentation make writing S-expressions orders of magnitude easier, and at least a constant factor easier than writing XML.
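
To make the "more work to generate" comparison concrete, here is a minimal sketch in Python (my choice of language, not something from the thread). The tuple-based tree layout and the names `to_xml` and `to_sexpr` are assumptions made up for illustration; it handles only element names and text, with no attributes.

```python
from xml.sax.saxutils import escape

# Hypothetical tree layout: (tag, child, child, ...), where each child is
# either another tuple or a plain string of text content.
table = ("table",
         ("tr", ("td", "a"), ("td", "b")),
         ("tr", ("td", "c"), ("td", "d")))

def to_xml(node):
    """Serialize the tree as XML, escaping &, < and > in text."""
    if isinstance(node, str):
        return escape(node)
    tag, *children = node
    return "<{0}>{1}</{0}>".format(tag, "".join(to_xml(c) for c in children))

def to_sexpr(node):
    """Serialize the tree as an S-expression (text is NOT escaped here)."""
    if isinstance(node, str):
        return node
    tag, *children = node
    return "(" + " ".join([tag] + [to_sexpr(c) for c in children]) + ")"

print(to_xml(table))
print(to_sexpr(table))
```

Both serializers are a few lines each; the visible difference is that the XML one writes every tag name twice. Note that `to_sexpr` does not escape parentheses in text, so it is only safe for content like the table above.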

You are right about text editors. XML was designed to be reasonably easy to write and edit by humans without specialized software. The redundant end-tag helps to catch errors and makes the structure more explicit.

Sure, everyone could just use a fancy specialized editor with paren-matching and auto-indentation. But one of the goals of XML was precisely that it should not rely on specialized software to be read and written.

Your example with the table is a lot clearer with sexpr syntax because you don't actually have any content in the table. Try again with a few sentences of mixed content, some bolded words, a link, and so on, and you will get my point.

Note that you would also need to gzip your s-expressions if you are concerned about size.

The HTML of your comment, with some formatting added:

        <span class="comment">
         <font color=#000000>
          You are right about text-editors. XML was designed to be reasonable
          easy to write and edit by humans without specialized software. The
          redundant end-tag helps to catch errors and make structure more
          explicit.
          <p>
          Sure everyone could just use a fancy specialized editor with paren
          matching auto-indentation. But one of the goals of XML was
          precisely that it should not rely on specialized software to be
          able to read and write.
          <p>
          Your example with the table is a lot clearer with sexpr syntax
          <b>because you don't actually have any content in the table.</b>
          Try again with a few sentences of mixed content, some bolded
          words, a link, and so on, and you will get my point.
          <p>
          Note that you would also need to gzip your s-expressions if you are
          concerned about size.
         </font>
        </span>
The same thing in S-expressions (an invented syntax):

        (span (class . comment)
          (font (color . #000000)
            You are right about text-editors. XML was designed to be reasonable
            easy to write and edit by humans without specialized software. The
            redundant end-tag helps to catch errors and make structure more
            explicit.
            (p)
            Sure everyone could just use a fancy specialized editor with paren
            matching auto-indentation. But one of the goals of XML was
            precisely that it should not rely on specialized software to be
            able to read and write.
            (p)
            Your example with the table is a lot clearer with sexpr syntax
            (b because you don't actually have any content in the table). Try
            again with a few sentences of mixed content, some bolded words, a
            link, and so on, and you will get my point.
            (p)
            Note that you would also need to gzip your s-expressions if you are
            concerned about size.))
It's really not a lot different. Of course, parens would need to be escaped, but this is no different from needing to escape < and >.
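
A rough sketch of what that escaping could look like, in Python. The backslash convention in `sexpr_escape` is an assumption I picked for the invented syntax above, not an established standard; the XML side uses `xml.sax.saxutils.escape` from the standard library.

```python
from xml.sax.saxutils import escape as xml_escape

def sexpr_escape(text):
    """Escape text for the invented S-expression syntax.

    Assumed convention: a backslash protects parens, so the backslash
    itself must be escaped first.
    """
    return (text.replace("\\", "\\\\")
                .replace("(", "\\(")
                .replace(")", "\\)"))

print(sexpr_escape("call f(x) here"))
print(xml_escape("a < b & c > d"))
```

Either way it is one small function; the cost shows up in prose that happens to use a lot of parentheses, or a lot of angle brackets.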

"But one of the goals of XML was precisely that it should not rely on specialized software to be able to read and write."

But this is the problem with XML... it does rely on special libraries to validate and parse into reasonable data structures. It requires special heuristics to describe how to recover nicely in the event that markup isn't valid. It requires a document describing exactly what the XML needs to look like.

S-expressions are easier to parse, less verbose, and can accomplish all the same tasks and more, all while being more flexible in general.
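
As a rough illustration of the "easier to parse" claim, here is a minimal sketch of a reader in Python. The names `TOKEN` and `parse` are made up for this example, and it deliberately ignores quoted strings, escaping, attributes, and character encodings, so it covers only bare atoms and nesting.

```python
import re

# A token is a single paren or a run of non-space, non-paren characters.
TOKEN = re.compile(r"[()]|[^\s()]+")

def parse(text):
    """Parse one S-expression into nested Python lists of strings."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read())
            if pos >= len(tokens):
                raise ValueError("missing closing paren")
            pos += 1  # consume the ")"
            return items
        if tok == ")":
            raise ValueError("unexpected closing paren")
        return tok

    result = read()
    if pos != len(tokens):
        raise ValueError("trailing input after first expression")
    return result

print(parse("(table (tr (td a) (td b)) (tr (td c) (td d)))"))
```

This is a toy: as soon as the content is running text rather than single atoms, the reader has to decide how whitespace, quoting, and escapes work.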

What you have done in your example is reinvent XML with round parentheses instead of pointy brackets. Why is this better? The only difference is that you leave out the redundant end tags, which are there for good reason.

Yes, XML requires a library to parse, but so do s-expressions! The reason XML seems more complex than sexprs is that it defines a higher-level syntax, e.g. with element/attribute distinctions. You have reinvented that yourself in your example, so you need a spec for it and you need the parser to support it. The rules for encodings and character sets also have to be specified (e.g. how do you detect the encoding of a file? Which characters count as whitespace?). You will end up with a spec much like XML, except with round parentheses. (OK, XML is also complex because of DTDs, but that is an optional part. If you want something like DTDs for sexprs, again, you have to specify it, and you get something like XML.)

Btw, there are no heuristics for recovery in XML. XML parsers must fail when encountering malformed syntax. This is one of the major (and controversial) differences between XML and HTML.
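
The fail-on-error behavior is easy to see with any conforming parser. A small sketch using Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` is made up for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# A properly closed paragraph is accepted.
print(is_well_formed("<p>Note that you would also need to gzip.</p>"))
# HTML-style unclosed <p> tags are rejected outright, with no recovery.
print(is_well_formed("<span>one<p>two<p>three</span>"))
# An unquoted attribute value is rejected too.
print(is_well_formed("<font color=#000000>text</font>"))
```

An HTML parser would typically accept the last two inputs and guess at the intended structure; the XML parser simply reports an error.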

I appreciate s-expressions as a syntax for a programming language. But code is a very different use case than documents. I wouldn't like to program in XML syntax either! E.g. programs (hopefully) don't have deeply nested structures covering several pages. That is common in documents, hence the importance of the redundant end tag.

I like sexprs for code and data, but for documents they are only simpler if you ignore a lot of real-world issues.

Megginson's essay "All markup ends up looking like XML" compares XML, JSON and s-exp

http://www.megginson.com/blogs/quoderat/2007/01/03/all-marku...

BTW, the HTML is not valid XML, so your example is a bit misleading. In XML the P elements contain the paragraphs rather than delimit them. The XML would be more verbose since it needs end-tags for P:

    <p>Note that you would also need to gzip your 
    s-expressions if you are concerned about size.</p>
The s-expr OTOH would be more confusing, because there isn't a clear distinction between element-name and content:

    (p Note that you would also need to gzip your 
    s-expressions if you are concerned about size.)
You might want to choose a different syntax to make the distinction clearer:

    (p "Note that you would also need to gzip your
    s-expressions if you are concerned about size.")
or:

    ((p) Note that you would also need to gzip your
    s-expressions if you are concerned about size.)
In the end, you have to make some of the same trade-off decisions that the designers of SGML and XML did. Just saying that s-expressions are simpler than XML is like saying ASCII is simpler than s-expressions: True, but kind of missing the point.
Well, in the case of using XML as markup (what it was designed for), it is clearer than the s-exp. The only time I like XML editing is DocBook, because when you end a tag, you never have to bounce back up (which may be more than a screen away) to know what tag you are in.

That's about the only time I like it, though.

"Paren-matching and auto-indentation make writing S-expressions orders of magnitude easier, and at least a constant factor easier than writing XML."

Funny; I find that the tag matching, auto-indentation, and auto-completion in vim make editing XML pretty easy.

Maybe S-expressions are slightly easier. Maybe not.