| You are confusing syntax and semantics. HTML and the DOM are two different things. HTML is a string of characters (syntax). The DOM is a data structure (semantics). Normally a DOM is produced by parsing HTML, but it can be produced in other ways (by running JavaScript code, for example). S-expressions are a data structure, different from the DOM, but S-expression syntax is a syntax. Normally S-expression syntax is parsed to produce S-expressions, but it can also be parsed to produce other things. S-expression syntax can be parsed to produce a DOM. The easiest way to do this is to parse S-expression syntax into S-expressions, render those S-expressions into HTML code, and then use an off-the-shelf HTML parser to parse the HTML. But you could also write a parser that parsed S-expression syntax directly into a DOM if you wanted to. You could also write a transformation program that compiled S-expressions directly into a DOM without going through the intermediate HTML. The answer to your question of how to add an attribute to an implied element is that it is not possible to do that in HTML. It is only possible to add an attribute to an implicit element of the DOM produced by parsing an HTML document that omits that element (because at that point the element is no longer implicit). The exact same thing is possible using S-expressions. For example, here's how you write tables in my library: (:table (header header ...) (data data ...) (data data ...)) This string of characters is parsed by the Lisp reader to produce an S-expression that has a one-to-one correspondence with the string you see above. But then there is an extra processing step that transforms that into a different S-expression whose printed representation is: (:table (:tr (:th header) (:th header) ...) (:tr (:td data) (:td data) ...) ...) At that point you can manipulate that S-expression in the same way that you manipulate the DOM (because they are both just data structures). 
Once you're done, you convert the S-expression to a DOM. At the moment that is done by rendering to HTML, but as I noted above that is just an implementational convenience to take advantage of the fact that HTML->DOM parsers are available off the shelf. You don't have to do it that way (and indeed the world would be a better place if it were not done that way). All of this is trivial when dealing with S-expressions precisely because of the strict 1-to-1 correspondence between data structure and visual representation that does not exist in SGML-derived languages. That is why writing code for SGML-derived languages using S-expression syntax is so advantageous. (Actually, this is true for any language, not just SGML-derived languages. It's just a little more obvious for SGML-derived languages because SGML syntax already kinda sorta looks like a data structure representation so it's a little easier to grasp what is going on.) |
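The expansion step described above can be sketched in a few lines. This is only an illustration in Python, not the library's actual code: nested lists stand in for S-expressions, strings starting with ":" stand in for keyword symbols, and the names expand_table and render, as well as the convention of a dict in second position holding attributes, are all hypothetical.

```python
from html import escape


def expand_table(form):
    """Expand [":table", header_row, data_row, ...] into explicit :tr/:th/:td forms."""
    tag, header, *rows = form
    if tag != ":table":
        raise ValueError("expected a :table form")
    expanded = [[":tr", *[[":th", cell] for cell in header]]]
    for row in rows:
        expanded.append([":tr", *[[":td", cell] for cell in row]])
    return [":table", *expanded]


def render(node):
    """Render an expanded form to an HTML string.

    Hypothetical convention: a dict directly after the tag holds attributes.
    """
    if isinstance(node, str):
        return escape(node)
    tag, *rest = node
    attrs = {}
    if rest and isinstance(rest[0], dict):
        attrs, *rest = rest
    name = tag.lstrip(":")
    attr_text = "".join(f' {key}="{escape(value)}"' for key, value in attrs.items())
    return f"<{name}{attr_text}>" + "".join(render(child) for child in rest) + f"</{name}>"


short = [":table", ["Name", "Qty"], ["apple", "3"]]
full = expand_table(short)

# The header row was implicit in the shorthand; after expansion it is an
# ordinary list, so it can be given an attribute like any other element.
full[1].insert(1, {"class": "head"})

html_out = render(full)
print(html_out)
```

The point of the sketch is that the row only becomes addressable after the expansion, which mirrors the claim that an implied element can only be given an attribute once it exists in the data structure.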
I believe this is where the confusion is coming from. When you parse HTML syntax, you get a data structure, just as you get a data structure when you read sexpr syntax. Both of these data structures are different from the DOM tree.
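A small sketch of that distinction, assuming Python's standard html.parser module as a stand-in for "parsing HTML syntax" (it reports tags as written and does no tree construction). The TagLogger class is a hypothetical helper written for this illustration.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Record start tags, end tags and non-blank text exactly as they appear."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        if data.strip():
            self.events.append(("text", data))


logger = TagLogger()
logger.feed("<table><tr><td>x</td></tr></table>")
logger.close()

# Only the tags that were literally written show up here.
seen_tags = {name for kind, name in logger.events if kind == "start"}
print(sorted(seen_tags))
```

No tbody appears in this syntax-level structure, whereas the tree-construction rules of the HTML standard place the row inside an implied tbody element in the resulting DOM.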
Try this example:
Can CL-WHO generate HTML that matches that? (i.e., would feeding both into a tool like BeautifulSoup produce the same data structure?)
Outside of CL-WHO and Hiccup-type libraries, you can of course use S-exprs to represent the same data structure. Here's a hypothetical S-expr syntax that might produce the same data structure:
Which is what I believe JimDabell meant by:
> you can't represent all valid HTML documents as S-expressions, at least not in the convenient way people assume