Hacker News new | ask | show | jobs
by catnaroek 3496 days ago
> Do you get the difference between internal and external representations? When you see (lambda (a) (+ a 1)), you don't know how the code is represented internally and how its value, when executed [emphasis mine], is represented internally

I'm not talking about representations of semantic objects. I'm talking about representations of syntax. In Lisp's case, you don't get to choose - the macro system critically depends on a particular representation (named variables). Have fun writing macros that operate on HOAS.

2 comments

> In Lisp's case, you don't get to choose - the macro system critically depends on a particular representation (named variables)

Please tell me how Haskell represents HOAS and how this is not possible in Lisp.

Also, read Barzilay's paper (http://scheme2006.cs.uchicago.edu/15-barzilay.pdf), or about the Ergo Project (1988! http://repository.cmu.edu/cgi/viewcontent.cgi?article=2763&c...).

> Please tell me how Haskell represents HOAS and how this is not possible in Lisp.

I'm not talking about representing some object language's syntax in Haskell or Lisp. I'm talking about representing Haskell or Lisp's syntax in some other metalanguage (possibly Haskell or Lisp itself). How would you implement a macro expander that operates on HOAS?

> Barzilay's paper (http://scheme2006.cs.uchicago.edu/15-barzilay.pdf)

The object language implemented in that paper isn't the whole of Scheme, and in particular, it doesn't have macros.

> Ergo Project (1988! http://repository.cmu.edu/cgi/viewcontent.cgi?article=2763&c...)

Sorry, I can't find an explicit description of any object language's syntax in this paper.

> How would you implement a macro expander that operates on HOAS?

Hmm, you would expect that HOAS actually be helpful in this area in some ways. Since variables are no longer resolved by matching symbols to binding information in a scoped environment (all we have is direct names references to the binding inserted in the code, or something like that), we don't have hygiene issues when we move code around without having to do any alpha-renaming, or gensyms. It sounds rather convenient, unless we expressly want to do something non-hygienic.

So how would we implement a macro expander? Like we implement anything else: in whatever way that implementation satisfies the specification of macros under HOAS. What is the specification: that is the question. What are HOAS macros, or macros under HOAS?

Not having a formal specification, we might want at least some examples: OK, I have this HOAS artifact, and would like to be able to write it in a more condensed way using this other HOAS: now how to make one into the other? That gives us a function, where we can identify the inputs, which have to be arguments to the macro somehow and so it goes.

Some kind of dual macro system could be useful: a raw source code macro system for doing un-hygienic things, and then a parallel one which kicks in when the code is converted to HOAS.

> It sounds rather convenient, unless we expressly want to do something non-hygienic.

Agreed. I wouldn't want to do anything non-hygienic, but, from what I can tell, most other macro users aren't exactly very fond of everything being strictly hygienic. It's precisely implementing the unhygienic bits that would be unreasonably hard with HOAS.

In what language do we input code as HOAS, and target that representation with macros?

(Can't we expand macros in a conventional representation with symbols and environments and then go to HOAS?)

(It's not news that macros are weak in some sense; I mean you can shoot yourself in the foot with the classic non-hygienic ones; you can disrespect even the first order abstract syntax that you're working in: the macro can re-locate a piece of raw syntax which contains identifier references into the wrong scope where those references resolve otherwise.)

Anyway, have fun fishing for mackerel with your bear trap?

The “conventional” representation of syntax using explicitly named variables is utterly inconvenient and error-prone for anything other than parsing. The very first thing I want to do after parsing is convert the syntax tree to de Bruijn indices/levels or HOAS.
You could expand some conventional macros first, then go to HOAS with what is left (and then do another pass with a separate HOAS macro system, perhaps).

How the heck do you handle hierarchical run-time environments in these representations? Especially mutable ones? If we have this:

  (let (a b c)
     (let (d e f)
        (lambda ())) ;; escapes somehow
     (let (g h i)
       (lambda ()))) ;; ditto
     (lambda ())) ;; ditto
These lambdas capture different, overlapping environments which share the (a b c) bindings. If there is an inner loop which repeatedly instantiates the (d e f), then new closures are made with different instances of these variables: yet those closures all share the same (a b c) bindings.

So we can't just flatten the entire lexical scope to some HOAS terms and pretend that its hierarchical structure doesn't exist.

Mutable shared environments is an absolutely terrible idea. All this achieves is to gratuitously make it unsafe to call in parallel multiple closures created in the same environment. A saner approach is to let variable bindings be immutable, and only then let some variables have (not necessarily static) type “mutable cell” (something like Rust's `Rc` or `Arc`).
Even if environments are immutable, the problem has to be solved somehow that the environments are hierarchical. Newly instantiated versions of a sub-environment share the same super-environment. For instance, a local tail call occurring in the scope of some stable outer bindings, capturing different versions of the tail call inner variables.
I fail to understand why hierarchical environments pose a problem for HOAS. Environments map variables to their values, and they're used to give meaning to non-closed terms (containing variables that aren't bound within the term itself).

The whole point to HOAS is making unbound variables irrepresentable in the first place (which is the source of the accidental variable capture problem), eliminating the need for environments as separate data structures. For example, this is the HOAS representation of closed terms of the untyped lambda calculus:

    datatype term = Lam of term -> term
                  | App of term * term
Note there is no need for a separate constructor for free variables.