Hacker News new | ask | show | jobs
by wschroed 3708 days ago
I have only ever heard the opposing argument, but I am interested in understanding to its fullest extent how to implement good DSLs with macros. I understand CL and defmacro; I am just looking for good patterns of use. Can you point me to any resources, documentation, or examples of this?
3 comments

When you design a eDSL out of functions you get a lot of expressive power for free because you can lean on the host language's composition facilities. However this also means you have to shoehorn your semantics into an existing set of constraints (this is addressed by the "If possible" qualifier in "If possible, do things without macros") and you have to adopt all the complexity that the host language enables.

One big advantage of a DSL is that it limits complexity so that it's easier to predict what code will do, even if you haven't read it yet. With a macro-based DSL, you can know exactly what functionalty expressed in the DSL can and cannot do, which makes it easier to reason about without studying the code for hours.

so, i'm not a CL wizard, but i've spent some time with it. One cool part about macros, the part that really matters, you get to control the order of evaluation. That doesn't sound super interesting, so let's get a little more concrete.

Say you want to model option contracts. They can get crazy complicated. most DSLs will have things that are AND like and things that are OR like. In C you're stuck with something like this for the and case:

    all = 1;
    for(int i = 0; i < clause_count; i++){
        if(eval(clause[i]) == 0){
            all = 0;   
            break;
        }
    }
    return all;
The key point here is the break. we actually want to stop evaluation of the later parts of the code.

with a macro, you just expand out all of the eval'ing of the clauses in place. you can't just map over the list of clauses, because we want the early exit.

my lisp is rusty, but i think it'd look kinda like

    (defmacro optionand (&rest clauses) 
        `(and ,@(map (lambda (c) (list 'eval c) clauses)))
the gist being, we leverage the built in and to do something like:

    (and
        (eval google-at-700)
        (eval oil-at-50))
so you embed right in the language itself, taking advantage of the built in short circuiting of and.

in haskell, you can play a very cute game. since functions are lazy, you get full control over evaluation - you don't need a special form to implement if, for example.

So, the very cute trick is to make your dsl out of ordinary functions, that use typeclasses for implementation. in the above case eval would be a method in a typeclass. (you can do the very same thing in lisp, it just seems like a bit more effort to carry around the specific implementation of eval, haskell will just figure it out with the typechecker).

The thing that's great about the trick, you can write multiple implementations of eval, and not rewrite all your contracts. One implementation could just straight up evaluate it. Another version could do some fuzzier math with variance of the underlying price, and return a gaussian rather than a number.

Again, the key is controling the actual evaluation i think there's some bits and pieces about it in _on lisp_ which i think pg made available online. haskell's bracket function is a great example and would be a nice macro candidate

    (bracket obtain-resource do-stuff release-resource)
    (bracket get-socket talk close-socket) ; for example
no matter what happens in do-stuff, release is always called.

it's possible (even easy?) to do scary stuff with controlling the order of evaluation, but when you see code with those complicated breaks and continues, it's often a side effect of not having good macros.

edit

i learned the haskell trick from Oleg Kiselyov's papers - http://okmij.org/ftp/tagless-final/ He uses it to make an interpreter into a compiler, and very clever stuff with partial evaluation. That guy is really smart.

Although pg doesn't call out DSL's specifically, the macro chapter is good. http://unintelligible.org/onlisp/onlisp.html#SEC49 and indeed you can get the whole book here http://www.paulgraham.com/onlisp.html

All the arguments against the CL-style macros are stemming from the fact that it's far too easy to get into a total mess with them, simply because of their infinite flexibility. Yet, if you follow some very simple rules, you'll get the opposite.

Firstly, macros must be simple and must essentially correspond to compiler passes.

E.g., if you're implementing an embedded DSL for regular expressions, one macro expansion pass should lower your DSL down to a syntax-expanded regular expression, the second macro would lower it into an NFA, another one would construct an optimised DFA, and then the next one (or more) would generate the actual automaton code in Lisp out of the DFA.

The best possible case is when your macro pass can be expressed as a set of simple term rewriting rules. And most of the transforms you'd find in DSL compilers can actually be represented as such.

Of course, there is also an alternative style, which may be preferable if you have debugging tools for designing your compiler passes at least equivalent to the Lisp macro expansion debugging. You can simply write your entire DSL compiler as a single function, and then wrap it into a macro, as `(defmacro mydsl (&rest args) (mydsl-compiler args))`.

This way, the same compiler infrastructure can be reused for an interpreted or partially interpreted version of your DSL, if you need it. Still, all the same rules apply to how the passes are implemented in that compiler function.

Another very useful trick is to make your DSLs composable, which is easy if you split your compilation pipeline into separate macro expansion passes. Multiple DSLs may easily share features of their internal IRs or even front-end languages, so any new DSL would simply cherry-pick features from a number of existing DSLs and wrap them into a simple front-end. This degree of composability is simply impossible with the function-based interpreted eDSLs.

A can recommend taking a look at the Nanopass [1] framework, which is built around this very ideology, or at my own DSL construction framework [2].

[1] http://andykeep.com/pubs/np-preprint.pdf

And some examples:

https://github.com/cisco/ChezScheme

https://github.com/eholk/harlan

[2] https://github.com/combinatorylogic/mbase