Hacker News new | ask | show | jobs
by kazinator 1367 days ago
LC is only computationally foundational; it describes recursive functions. It's just another universal turing machine.

LC has a great disadvantage: it's difficult to write a LC interpreter in LC. This project shows exactly what that means. To write an LC interpreter, you need a data structure for representing expressions. You need a symbolic data type.

LC does not know what a LC expression is. Papers about LC know what that is, but they are not executable.

In Lisp we can say, okay, lambda calculus can be represented sort of like this:

  (lambda (x) x)
and so on. That's a nested list. It contains symbols. We can use an assoc list to associate symbols with the terms that are their values. And so on ...

Lisp has the programmatic vocabulary to talk about lambda calculus formally, in a way that is executable.

I don't suspect there is a significantly easier way for LC to interpret LC than to use the 42 page expression to create a Lisp, and then write the interpreter in that lisp.

4 comments

There is. Using the Mogensen–Scott encoding, a self-interpreter can be written in lambda calculus as

(λf.ff)(λf.λt.t(λx.x)(λm.λn.ffm(ffn))(λm.λv.ff(mv))),

which is a direct translation of the equivalent Haskell code

    data Term t = Var t | App (Term t) (Term t) | Abs (t -> Term t)
    newtype Function = Function {apply :: Function -> Function}

    interpret :: Term Function -> Function
    interpret (Var x) = x
    interpret (App m n) = apply (interpret m) (interpret n)
    interpret (Abs m) = Function (\v -> interpret (m v))
https://en.wikipedia.org/wiki/Mogensen%E2%80%93Scott_encodin...
I don't see what in the interpreter converts the lambda calculus into the Morgensen-Scott encoding.

The Wikipedia page describes a "mse" function that is in some meta-language which is not lambda calculus. So first wee need a Lambda Calculus based interpreter for the meta-language, which can run this "mse" function.

It looks like mse[x] is supposed to match a variable term, and mse[M N] matches a function application and so on. There are no such concepts and representations in lambda calculus, not to mention shape matching on them.

The meta language might as well just be a paragraph of English: instructions on how to hand-compile the lambda calculus into a bunch of thunks which the interpreter can just invoke in certain ways to bring about the evaluation.

The situation with Lisp is exactly the same. To run a Lisp self-interpreter, we don’t pass it a Lisp function:

    (interpret (lambda (x) x))
but rather an encoded version of that Lisp function’s code:

    (interpret (cons 'lambda (cons (cons 'x nil) (cons 'x nil)))
Of course, Lisp gives us a more convenient syntax for the latter, in the form of the quote macro:

    (interpret (quote (lambda (x) x)))

    (interpret '(lambda (x) x))
But the quote macro is not a function; it’s just syntax. If it were a function, you’d expect this to be equivalent:

    (interpret
      (let ((f (lambda (x) x)))
        (quote f)))
which of course it is not.

Although the quote macro is an important part of what makes Lisp Lisp, it’s not a fundamental part of what makes Lisp a programming language. We could write any Lisp program without it (assuming we were still given a way to build a primitive 'symbol).

> The situation with Lisp is exactly the same.

No it isn't, because the Lisp code is already understood to have an encoding. So we don't have to play any Gödel-numbering-like games to get the code to be able to talk about code. That battery is included.

> gives us a more convenient syntax for the latter, in the form of the quote macro

The ' in (cons 'lambda ...) is an instance of quote!

You must write (cons (intern "lambda") ...) to remove quote. Oops, now you're using a different kind of quote: a string literal quote. If you remove that, you will have character literals to otherwise build the symbol name.

I agree that quote is not essential: take out quote and you can still do useful symbolic processing. Just doing interactive testing and writing unit tests will be inconvenient, mainly.

The requirement for quote has a different effect. If we have quote, we can make the additional step in the documentation that all code has the representation produced by quote, even when quote is not being used. When lambda is seen in code, that is actually the same thing that (quote lambda) produces or that (intern "lambda") produces.

The above is almost inescapable if user-defined macros are supported. When code is read, it is not determined at read time what is a macro and what isn't. Therefore it is not known what parts of the form may need to be passed to a user-defined expander function without having been evaluated (and thus in the quote representation). The whole thing is in the quoted encoding, so that quote doesn't have to do anything other than pass through its interior.

Yes, I was clear above that I know ' means quote. My experiment is to compare Lisp to a restriction of Lisp where quote only works on symbols. This restriction doesn’t make it any harder to write a self-interpreter.

> If we have quote, we can make the additional step in the documentation that all code has the representation produced by quote, even when quote is not being used. When lambda is seen in code, that is actually the same thing that (quote lambda) produces or that (intern "lambda") produces.

I’m not sure what you’re suggesting here. Certainly the number (+ 2 2) must be distinguishable from the list '(+ 2 2). Even lambda expressions must be distinguishable from their quoted encodings, because otherwise lexical scoping breaks. If you make (lambda () x) equivalent to '(lambda () x), then this breaks:

    (let ((x 1)) (funcall (let ((x 2)) (lambda () x))))
and if you make (lambda () x) equivalent to `(lambda () ',x), then this breaks:

    (let ((x 1)) (funcall (lambda () (let ((x 2)) x))))
Macros expand at compile time, not runtime. Macros are also cool, but not fundamental, and don’t contribute to the ease of writing a self-interpreter.
We can analyze lexical scoping through quote. Here is TXR Lisp doing it:

  1> (defmacro free-vars (expr :env e)
       (if-match (quote @qexp) expr  ;; unwrap quote if it occurs
         (set expr qexp))
       (tree-bind (expansion free-vars free-funs . rest) (expand-with-free-refs expr e)
         ^(quote ,free-vars)))
  free-vars
  2> (let ((a 42)) (free-vars '(lambda () (+ a b))))
  (b)
Roughly speaking, this is the basis for doing things like back-referencing against an existing lexical variable in pattern matching:

  3> (let ((x 1))
       (match (@x @y) '(1 2) y))
  2
  4> (let ((x 1))
       (match (@x @y) '(2 2) y))
  ** match: (@x @y) failed to match object (2 2)
match is nothing but a macro; application code could supply an equivalent, depending only on what is publicly documented.

Syntactically, the variable terms in (@x @y) are in the same category. Yet x is interpreted by the pattern matcher as a bound variable, whose value matches the corresponding object; whereas y is interpreted as a free variable to be lexically bound to the corresponding object.

You just need a macro system with environment parameters, and a defined API into them.

With help from the macro system, we could write an interpreter such that (let ((a 1) (b 2)) (interpret '(list a b))) will yield (1 2). interpret can't be a function; it has to be a macro which analyzes the argument for variables and creates a bridge between those variables and the surrounding lexicals at the point where interpret finds itself. The interpret macro transforms the code somehow and then hands it to an interpreter function.

If we don't have quote for expressions but only for symbols, we can still write the interpreter function, and give it a test case by writing an expression which calculates the code that we want to interpret. Even without quote the language has given us a representation of the syntax that we can rely on.

We have it as a given that (list 'lambda (list 'x) 'x) produces (lambda (x) x).

We do not have such a thing in lambda calculus. We could create an extended lambda calculus which has it.

You're claiming that LC cannot implement a quoting operator. Which is quite wrong.

What you misunderstand is that a LC quote would not work on arbitrary lambda terms.

A Mogensen quote operator would take a Mogensen encoding, and output a Mogensen encoding of that Mogensen encoding.

Or a BLC quote operator would take a bitstring like 0010 which encodes the identity function λ 1, and output the blc encoding of the nil-terminated list of 4 booleans that represents that bitstring:

01000101100000110010110000011001011000001001011000001100101100000100000100000000101101110110

> would take a Mogensen encoding

obtained where?

> output a Mogensen encoding of that Mogensen encoding

That's not what a quote operator does; it does precisely nothing, yielding the argument formula without evaluating it. No encoding-of-encoding. Just the encoding.

> output the blc encoding of the nil-terminated list of 4 booleans that represents that bitstring

Where/how does that become λ 1 again?

> That's not what a quote operator does; it does precisely nothing,

This is what gives Lisp murky semantics; you need something (quote) to do nothing, while having nothing (no quote) does something (evaluate).

Lisp lacks referential transparency, even without the use of variables. An evaluated term can be evaluated again, yielding something different.

> Where/how does that become λ 1 again?

By decoding it, which is what mostly what the LC self-interpreter does, as detailed on pages 6,7 of [1].

[1] https://tromp.github.io/cl/LC.pdf

> An evaluated term can be evaluated again, yielding something different.

Yes, and a Mogensen-Scott encoding can be Mogensen-Scott-encoded again, requiring two rounds of decoding, and so on.

Multiple rounds of encoding and evaluation seem inescapable of you have the entanglement of homoiconicity.

The quote operator in Lisp is designed exactly right.

In mathematics there are literals like the number 3 or the set {}. These objects stand for themselves and are not understood as requiring any calculation: they just are.

Symbols like x do not stand for themselves. If you want to talk about x literally as the symbol object, it is inescapable that there is some quoting operator to indicate that the usual semantics of x denoting something else do not apply. (Oxford's A Dictionary of Computer Science has a definition of literal which acknowledges this very issue.)

Literals being constants, it means that when they are concretely implemented in a computer, in the best possible way, they do nothing other than trivially reproduce a canned value that already exists before the program starts.

A quote operator that doesn't work on arbitrary lambda terms is not correct. It doesn't meet the definition of quoting.
> LC has a great disadvantage: it's difficult to write a LC interpreter in LC.

It's not; here it is quoted from [1]:

    (λ 1 1) (λ λ λ 1 (λ λ λ λ 3 (λ 5 (3 (λ 2 (3 (λ λ 3 (λ 1 2 3))) (4 (λ 4 (λ 3 1 (2 1)))))) (1 (2 (λ 1 2)) (λ 4 (λ 4 (λ 2 (1 4))) 5)))) (3 3) 2) (λ 1 ((λ 1 1) (λ 1 1)))
> LC does not know what a LC expression is.

Again, it's trivial to encode LC terms as bitstrings [1], which are trivially decoded back into lambda terms. That is exactly what the above lambda expression does. Alternatively, you can encode LC terms with Mogensen's encoding, but that one doesn't give you a textual representation like a bitstring.

[1] https://tromp.github.io/cl/Binary_lambda_calculus.html#Lambd...

> BLC requires translating bitstrings into lambda terms, to which the machine (itself a lambda term) can be readily applied.

I.e. a man behind the curtain is required to complete the "interpreter".

Can you explain at a high level what you think a valid LC-in-LC interpreter looks like then? Because it's very hard to understand what your objection is to the examples in this thread.
A LC interpreter (whether in LC or anything else) has to process the actual syntax with the lambdas and symbols (or integer literals, in the case of De Bruin).

If it is necessary to translate that representation to something else, the interpreter must do that itself.

You can take a LC expression and write some other LC expression which encodes it; but if that is done outside of LC in some meta-language, then that has to be counted as part of the interpreter's implementation, which means that it's not in LC any more.

Based on that, I think it's not actually possible without extensions to LC.

You need to be able to read the syntax with the lambda symbols and names (or integer constants in the case of De Bruin), so I/O is needed. Or else, if that is unacceptable, you need to be able to embed the representation of the code as a quoted literal.

Without I/O or quoting, you have no way to express the test cases.

The basic tools required for implementing a LC interpreter in LC is demonstrated in LambdaLisp. Although LambdaLisp implements Lisp, the same tools can be used to implement a LC interpreter in LC.

The largest concern in question here may be how I/O is done. I/O can be done by viewing a lambda term as a function that takes a string as an input and outputs a string. Here, a "string" can be expressed as a "list" of "characters", where lists can be encoded using the Scott encoding, and characters can be encoded as a list of bits, and "bits" can be encoded as 0=\x.\y.x, 1=\x.\y.y. This I/O strategy is used in LambdaLisp as well, and is explained in my post in [1].

How Lisp terms are encoded as lambda terms in LambaLisp is explained in [2].

To implement a LC interpreter instead of a Lisp interpreter in LC, it then remains to somehow encode and quote a lambda calculus term into a lambda calculus term. The function described by tromp's reply in this thread, called the Universal Machine, does this - it does so by using an encoding called the binary lambda calculus (BLC) notation for encoding lambda terms into lambda terms. In this notation, an arbitrary lambda term in the usual plaintext notation such as \x.\y.\z.(x y z) can be encoded into a bitstream. The converse is also always possible. The Universal Machine uses the BLC notation for expressing lambda terms, with the same 0/1 bit encoding and list encoding mentioned earlier, and outputs a bitstream (encoded as lambda terms) representing the evaluation result. The BLC notation is described in my post in [3].

[1] https://woodrush.github.io/blog/lambdalisp.html#handling-io-...

[2] https://woodrush.github.io/blog/lambdalisp.html#basic-data-s...

[3] https://woodrush.github.io/blog/lambdalisp.html#the-binary-l...

Statements like "IO can be done [in lambda calculus] by ..." really mean "lambda calculus doesn't have IO but can be extended by ...".

Here's what it means to have an LC in LC interpreter. Firstly pick what LC means. Choose a representation for it. Exactly one. Show that this is lambda calculus with no extensions. Then show how an embedded expression of this can be interpreted.

For instance supposed we have a lambda calculus which is based on a three letters M I and U. Spaces are insignificant.

Say that MUIMUIMMU is a valid expression. Then a self-interpreting situation might look like this:

  IMUMIUMM...MUI MUIMUIMMU UMI
The to-be-interpreted expression appears embedded. I set it off with spaces for clarity. It is not translated into any different representation. The surrounding material represents the interpreter. There needn't be an epilogue piece after the embedded expression. Important: no part of this interpreter depends on the embedded expression. We can swap in arbitrary other expressions without changing the interpreter; will correctly interpret those expressions.

I am skeptical that this is possible with standard, unextended lambda calculus, whether regular or de Bruin.

The objection to them is that they are trivial and lead to none of the profound truth that is usually implied by their presentation.
> it's difficult to write a LC interpreter in LC.

Why? It's pretty easy to write Lisp in LC (a simple Lisp, not the feature-full version described in TFA) and it's pretty easy to write an LC interpreter in Lisp. I'll bet that an LC interpreter in LC could be done in only a few hundred lambda terms.

Good point; a minimal Lisp needed for processing LC doesn't require the 42 page formula. That has features like mutable global variables.
> an LC interpreter in LC could be done in only a few hundred lambda terms.

Make that a few dozen...

Yeah, I should have realized that. In fact, I had seen your BLC interpreter some time ago but for some reason it didn't come to mind as I was writing that. I must be getting old.
lisper and kazinator need to put down the keyboard and spare some time for reflection. thanks tromp for your tireless efforts at education. big fan of your work!
Perhaps LC is telling us that maybe macros weren't such a good idea after all??