by swhipple
3421 days ago
> HTML is a string of characters (syntax). The DOM is a data structure (semantics). [...] S-expressions are a data structure, different from the DOM, but S-expression syntax is a syntax. I believe this is where the confusion is coming from. When you parse HTML syntax, you get a data structure; this is the same as when you read sexpr syntax, you also get a data structure. Both these data structures are different from the DOM tree.
Try this example:
<pre>
<span>one
</span>
<br>
<span>two</span>
<br />
</pre>
Can CL-WHO generate HTML that matches that? (i.e. feed both into a tool like BeautifulSoup and produce the same data structure?)
Outside of CL-WHO and Hiccup-type libraries, you can of course use S-exprs to represent the same data structure. Here's a hypothetical S-expr syntax that might produce the same data structure:
((pre)
"\n " (span) "one\n " (/span)
"\n " (br)
"\n " (span) "two" (/span)
"\n " (br/) "\n"
(/pre))
Which is what I believe JimDabell meant by:
> you can't represent all valid HTML documents as S-expressions, at least not in the convenient way people assume
In the case of S-expressions that is true. In the case of HTML it may or may not be true. It depends on how the HTML parser is implemented. There is a "natural" mapping of HTML onto a parse tree that is different from the DOM, but that is not part of the standard (AFAIK).
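To make the "natural" mapping concrete, here is a rough sketch in Python using only the standard library's `html.parser` (not BeautifulSoup, and not anything from CL-WHO). It turns the example into a flat token list that lines up one-to-one with the hypothetical S-expr above. The names `TokenCollector`, `tokenize`, `serialize` and `EXAMPLE` are made up for illustration, the one-space indentation is taken from the `"\n "` strings in the S-expr, and attributes are ignored to keep it short.

```python
from html import escape
from html.parser import HTMLParser

# The example from upthread, with whitespace as spelled out in the S-expr.
EXAMPLE = "<pre>\n <span>one\n </span>\n <br>\n <span>two</span>\n <br />\n</pre>"


class TokenCollector(HTMLParser):
    """Collect a flat token list; whitespace and <br> vs <br /> are preserved."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        # Attributes are dropped in this sketch.
        self.tokens.append(("open", tag))

    def handle_endtag(self, tag):
        self.tokens.append(("close", tag))

    def handle_startendtag(self, tag, attrs):
        # Called for XHTML-style tags such as <br />.
        self.tokens.append(("selfclose", tag))

    def handle_data(self, data):
        self.tokens.append(("text", data))


def tokenize(html):
    """Parse an HTML string into a list of (kind, value) tokens."""
    parser = TokenCollector()
    parser.feed(html)
    parser.close()
    return parser.tokens


def serialize(tokens):
    """Turn a token list back into an HTML string."""
    parts = []
    for kind, value in tokens:
        if kind == "open":
            parts.append(f"<{value}>")
        elif kind == "close":
            parts.append(f"</{value}>")
        elif kind == "selfclose":
            parts.append(f"<{value} />")
        else:
            parts.append(escape(value, quote=False))
    return "".join(parts)


tokens = tokenize(EXAMPLE)
for token in tokens:
    print(token)

# For this input the token list round-trips to the original string.
assert serialize(tokens) == EXAMPLE
```

A token list like this keeps `<br>` and `<br />` distinct and keeps every whitespace run, which a DOM tree does not promise. A conforming HTML5 tree builder, for instance, drops a newline that comes immediately after the `<pre>` start tag, so which structure you get really does depend on the parser.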
> Can CL-WHO generate HTML that matches that?
Yes, though native Common Lisp does not provide C-like string escapes, so putting in newlines is a little awkward. You could, of course, bring in a string interpolation library, but here's how you can do it without that:
Or you could do this: which looks like cheating but is actually closer to the spirit of the original.
The PRE tag is really weird because it actually changes the way things inside it are parsed. You can actually implement that in Lisp too via reader macros. CL-WHO doesn't support that out of the box, but it's not hard.
I can't imagine anyone actually wanting to do that, though. The PRE tag is for presenting pre-formatted text without changing its appearance, so embedding other tags inside it is kinda perverse. [EDIT: I was wrong about this. See below.]