Hacker News new | ask | show | jobs
by lispm 2453 days ago
Lisp macros are not based on an AST. The macro forms in Lisp need to be valid s-expressions and begin with a macro operator. That's it.
1 comments

Still, isn't it AST-based in the sense that the input of a Lisp macro is effectively a parsed syntax tree, just as in Nim?
The only restriction is that it is a s-expression: a possibly nested list of data. There is no particular programming language syntax it is parsed against - it's parsed as data.

For example this is a valid Lisp macro form

    (loop for i below 10 and j downfrom 20
          when (= (+ i j) 9) sum i into isum of-type integer
          finally (return (+ j isum)))

    It returns 10.
The LOOP macro parses it on its own - it just sees a list of data. Lisp has other than that no idea what the tokens mean and what syntax the LOOP macro accepts.

    CL-USER 142 > (defmacro my-macro (&rest stuff)
                     t)
    MY-MACRO

    CL-USER 143 > (my-macro we are going to Europe and have a good time)
    T
Works.
The data has an AST. The AST is made of conses, and atoms. The atoms have a large variety of types. They are not "tokens", but objects. Tokens only exist briefly inside the Lisp reader. 123 and |123| are tokens; one converts to an integer node, one to a symbol node. (1 (2 3)) has an AST which is

    CONS
   /      \
  FIXNUM   CONS
  |        /   \
  1      CONS   SYMBOL
        /     \     \ 
      FIXNUM   CONS  NIL
        |      /   \
        2   FIXNUM  SYMBOL
              |        |
              3        NIL
> The data has an AST.

(not (eq 'has 'is))

But for an Abstract Syntax Tree for code we have more categories: function, operator, call, control structure, variable, class, ...

That kind of tree is stipulated a particular form of compiler (well, parser) writing dogma revolving around C++/Java style OOP. You set up classes with inheritance representing various node types; everything has methods, and "visitors" walk the tree, and various nonsense like that.

If our code walker dispatches on pattern matches on the nested list structure of conses and atoms, we don't need that sort of encapsulated data structuring. The shapes of the patterns are de facto the higher level AST nodes. The code walking pattern case that recognizes (if test [then [else]]) is in fact working with an "if node". That AST node isn't defined in a rigid structure in the underlying data, but it's defined in the pattern that is applied to it which imposes a schema on the data.

If that's not an AST node, that's like saying that (1 95000) isn't an employee record; only #S(employee id 1 salary 95000) is an employee record because it has a proper type with a name, and named fields, whereas (1 95000) "could be anything".

Syntax trees are no nonsense and go way back before Java and visitors were hip.

The code walker is just another parser. It needs to know the Lisp syntax. It needs to know which parts of a LET form is a binding list, what a binding list looks like, it needs to know where declarations are and where the code body ist. It can then recognize variables, declarations, calls, literal data, etc, It needs to know the scope of the variables etc. Nothing of that is encoded in the LET form (since it is no AST), and needs to be determined by the code walker. Actually that's one of the most important uses: finding out what things are in the source code. Lisp does not offer us that information. That's why we need an additional tool. A code walker may or may not construct an AST.

No, (1 950000) is not an employee record. Only your interpretation makes it one. Other than that our default Lisp interpretation based on s-expressions: it's a list of two numbers. In terms of the machine it's cons cells, numbers, nil. Without further context, it has no further meaning.

Fair enough I guess. If the Lisp macro system only sees parsed s-exps, in principle you still need to figure out for yourself what kind of construct you are dealing with on a higher level of abstraction. Nim's macro system seems to be operating on a higher level of abstraction in that regard.