Hacker News new | ask | show | jobs
Where Did Combinators Come From? Hunting the Story of Moses Schönfinkel (writings.stephenwolfram.com)
173 points by abiboudis 2019 days ago
14 comments

I know Stephen Wolfram is a bit of a contentious figure, but his write-ups are always absolutely incredible. And if you're wondering why combinators are important (particularly S and K), SKI is a universal formal system[1]. Which is kind of crazy to think about: just two axioms and modus ponens can compute anything (I is just SKK).

[1] http://people.cs.uchicago.edu/~odonnell/Teacher/Lectures/For...

There are actually single combinators which form a complete basis, i.e. from which you can get S and K, and thereby universal computation.

https://en.wikipedia.org/wiki/Combinatory_logic#One-point_ba...

https://en.wikipedia.org/wiki/Iota_and_Jot#Universal_iota

Iota is still defined in terms of S and K. More technically, Iota is an "improper" combinator (see pg. 779 in [1]). Therefore, you can't axiomize Iota, so it would be hard to argue that it's more "fundamental" than S and K.

[1] https://books.google.com/books?id=1xEVkzuX5e0C&pg=PA779&lpg=...

A single point basis (like X = \z. z K S K for which X X = K K, X X X = K, and X (X X) = S) is not in itself a combinator.

A combinator is not just a closed lambda term; it must have all lambdas in leading position. Like S = \x \y \z. x z (y z)

Does this mean that the Y combinator, Y = λf.(λx.f (x x)) (λx.f (x x)), is not a combinator?
> A single point basis [...] is not in itself a combinator.

> A combinator [...] must have all lambdas in leading position.

What's your opinion on:

  ωabcd = bd(cd) if a has no normal form[0][1]
    or  = c      if b has no normal form
    or  = d      if c has no normal form
    or  = dd     otherwise
  where
  SII = ωωωω
  Ω = SII(SII) = ωωωω(ωωωω)
  S = ωΩ = ω(ωωωω(ωωωω))
  K = ωωΩ = ωω(ωωωω(ωωωω))
  I = ωωωΩ = ωωω(ωωωω(ωωωω))
as a single-'combinator' basis?

0: Aka, does not halt.

1: Evaluation diverges if a (or b or c, if queried) is somthing like:

  SII(C(C(B(ωωx(KI)K)(SII))Ω)I)
(from https://en.wikipedia.org/wiki/Combinatory_logic#Undecidabili...) that tries to diagonalize ω.
Such a combinator ω cannot exist.
The article is a deep dive into Moses Schönfinkel's life, and mentions how his work is picked up by Haskell Curry.

It is very long, contains surprisingly few of the characteristic Wolfram promotion usually found in Wolfram texts, but besides a detailed biography and pictures of historic documents, there are a few gems (people who researched "algebra of logic" before most others) for the persistent reader.

> And maybe if the operation we now call currying needs a symbol we should be using the “sha” character Ш from the beginning of Schönfinkel’s name to remind us of a person about whom we know so little, but who planted a seed that gave us so much.

That's a good suggestion!

For folks who are interested in combinators but didn't get what they want from the text: the approachable reference on combinators is Hindley and Sheldon's "Lambda-Calculus and Combinators: an introduction"

... and of course, there is "To mock a mockingbird" by Raymond Smullyan, which introduces the topic through puzzles.

Typo: Hindley and Seldin ... silly phone autocorrect
If video is more your thing, Stephen just wrapped up a long, interesting livestream about combinators, and the life of Moses: https://www.youtube.com/watch?v=PG2G5xSz0NQ
Just watched it live on twitch! There where some great people there for the Q and A.
Really amazing cast: Dana Scott, Phil Wadler, Barry Jay, last student of Curry (whose name I don't remember)... it's really interesting to see all these people together.
There are so, so many rich sidenotes in this post.

One I like:

"Of course, there are confusions. There’s yet another birthdate for Schönfinkel: September 4, 1889. Wrong year. Perhaps wrongly done correction from the Julian calendar."

Here we are over 100 years later and still dealing with constant time and calendar bugs.

I've given an 101 minutes trying really hard to grok this... everything I've found skips the very basic definition of what S and K are for some astoundingly dumb reason.

Stephen Wolfram obviously knows what he's talking about... but lacks Feynman's ability to put it into layman's terms.

My suspicion is that there is an implied list, and if you go off the end of the list, you get a hidden zero... if not, you get a one.

The SKI combinator Wikipedia page has a section about Boolean logic that comes close to making sense to me.

Please let me know if I'm getting this right.... restating the page, putting it into familiar notation

Booleans are a list of [1,0]

T(x,y) = x

F(x,y) = y

Applying this over Booleans

T(1,0) --> 1

F(1,0) --> 0

They then go to explain that

T(x,y) = K combinator

Then they go on to explain

False(x,y) = SK, which I think is K(S(Boolean)

It's at this point, I'm lost.

Is SK --> S(K(xy)) or K(S(xy))

I can't resolve that to get any further.

This is what S and K (and I) are:

    S := λx.λy.λz.x z (y z)
    K := λx.λy.x
    I := λx.x
There's nothing really more to it. What's weird/amazing is that just with these three (technically, just S and K, as I can be derived from them), you can compute anything.
I still don't understand the way this breaks down...

How many parameters do S,K,or I take?

If x=11, y=12, z=13 what are S,K? I assume I = 11

This is more of a λ-Calculus question (unrelated to combinators, really). But S takes 3, K takes 2, and I takes 1. This isn't strictly correct because S/K/I isn't a function, but rather a chain of lambdas with no free variables (and we get the number of "parameters the function takes" by counting the lambda symbols). See §1 in [1]. In your specific case,

    S 11 12 = 12
    K 11 12 13 = 11 13 (12 13)
    I 11 = 11
Keep in mind that the literals "11," "12," and "13" in λ-Calculus aren't numbers.

[1] https://personal.utdallas.edu/~gupta/courses/apl/lambda.pdf

If I understand this right...

S takes 2 values, and returns 1

K takes 3 values, and returns the 1st, 3rd, and (2nd applied to 3rd)

I returns the value it is given (like a no-op)

> K takes 3 values, and returns the 1st, 3rd, and (2nd applied to 3rd)

Everything applies: so, e.g., when we see something like ABC, we actually omit the parentheses; if we include them, we have (((A)B)C). This can be confusing because applications associate to the left, but lambdas associate to the right[1].

So 11 13 (12 13) is actually (((11)13)((12)13)). In roundabout English, 11 applies to 13 which then applies to the result of 12 applies to 13.

[1] https://www.cs.cmu.edu/~anupamg/251-notes/lambda_calculus.pd...

Try this video of you haven't already https://youtu.be/pAnLQ9jwN-E
I did... thanks for the link. I also went back and started watching the first part.
In the base definition are no numbers involved. Just S K and I, with these rewrite rules:

    S x y z = x z (x z)
    K x y = x
    I x = x
So, you can think of the whole thing as a list of S K I and parenthesized sequences of S, K, and I.

If you see S up front, you take the next three things (call then x y and z) and put them back on in this order x y (x z). x, y, and z are either a letter or a parenthesized sequence of letters and parentheses.

Note that S is the only combinator that makes the list longer.

If you want to think of them as functions they need to be _curried_. So they take one argument at a time and return a new function that takes the next argument. In javascript, if "(x, y) => x + y" were your function, the curried version would be: "x => y => x + y".

Expressed in javascript, our combinators are like: (but lazily evaluated)

    let S = x => y => z => x(z)(y(z))
    let K = x => y => x
    let I = x => x
Note for example that (K x) would be a new function that takes one more value, throws it away and returns x. K is sometimes called the constant function.

True / False are represented by building an expression that returns either the first thing that comes after it, or the second thing. So True/False are in essence their own if statements:

    T t e = t
    F t e = e
(I'll call the variables t and e for "then" and "else".)

So T = K, and SK will do what F is doing above, we apply it to t e and expand using the rules above to see:

    S K t e = K e (t e) = e
You can think of this as just applying the rewrite rules above, but it sounds like you want to also think of it in terms of functions. In which case, in the javascript notation you're looking at:

    S(K)(t)(e)
S is applied to K to get a new function that takes two more arguments, above we wrote:

    S = x => y => z => x(z)(y(z))
so

    S(K) is y => z => K(z)(K(y(z))
Applying this to 't', we get

    S(K)(t) is z => K(z)(K(t(z))
And then applying this to e:

    S(K)(t)(e) is K(e)(K(t(e)))
Now K(e) is (anything) => e, so the K(t(e)) is just thrown away.

This is where trying to express it in javascript gets a little wiggy. The t(e) here might not make sense. The K(e) function throws away the K(t(e)) but in javascript all of the arguments are evaluated before handing them to the function, so it would evaluate t(e) and then K(t(e)) before handing it K(e), where it is then ignored.

In the SKI calculus, recursion relies on this laziness (or you'd loop forever going down a path that would be thrown away).

Numbers can be expressed as functions using _Church encoding_.

  Zero f x = x
  Succ num f x = f (num f x)
Here, Zero is 0, Succ Zero is 1, Succ (Succ Zero) is 2, etc. If you expand this out, it looks like:

  Zero f x = x
  One f x = f(x)
  Two f x = f(f(x))
  ...
So a number ends up being something that takes a function and a value and calls the function on the value that number of times. There are ways to do addition, subtraction, multiplication, etc.

If you want to go deep down this rabbit hole, there was an IOCCC entry that was a simplified version of Haskell that compiled down to the SKI calculus - and it could compile itself. The process of developing it is written up here:

   https://crypto.stanford.edu/~blynn/compiler/
It's pretty dense stuff, but fascinating.
Elsewhere in the thread, was a YouTube link showing demos of church numbers, all in javascript. It was interesting, and there was a function to take church numbers and convert them back to normal whole numbers. I get the big picture, but am hung up on a detail or two.

>S K t e = K e (t e) = e

S K t e = K e (t e) // that makes sense

What is (t e)? I got to the same point in the python example down thread.

Yeah, I tried to mention that near the end, but it can be hard to explain.

It is "t" applied to "e" - so it would be t(e) in javascript notation. This could be nonsensical, but because of lazy evaluation it never gets expanded/evaluated. (K doesn't use its second argument.)

The SKI stuff is defined as applying those replacement rules from the outside in, so:

    K e (t e) ... ->  e ...
you take that whole expression from the front (K and the next two expressions) and replace it with the expression that was in the first argument's location. But (t e) here is dropped without ever being expanded / evaluated.

Most languages fully evaluate arguments before passing them to functions, but a few rare languages like Haskell don't evaluate the arguments until they're needed. So for

    K e (t e)
it will assign the whole tree (t e) wherever y occurs in the function definition, and just keep passing it along until something needs its value (if you use apply it to something like a function, use it in a case statement, or pass it to a primitive function like plus).

The miranda language (whose source was recently published) is also lazy, and works by compiling your code down to SKI combinators and then evaluating them. So this actually has been used in practice.

That's a fascinating piece, although IMO Wolfram could really do with an editor. While I realise that the article contains serious/important information (for mathematicians, historians, comp scientists etc), this bit resonated the most with me:

Göttingen was at the time a top place for mathematics. In fact, it was a sufficient “math town” that around that time postcards of local mathematicians were for sale there.

Utterly charming.

I really enjoy the length of his pieces, because they are so full of visuals. Sometimes I just look at the diagrams and have "aha moments". Reading one of his posts is like going to the Loove (sp?).
Louvre :)
I can really recommend a book called „To Mock a Mockingbird“ on this topic. Full of deep dives, cool puzzles and such.
I suspect Smullyan was one of those people who achieved enlightenment by pure reason.

Consider "Planet Without Laughter" (note this from the site of Knuth-- that Knuth --and presumably he hosts it there for a reason, so pay attention kids!)

https://www-cs-faculty.stanford.edu/~knuth/smullyan.html

Also he looks like Gandalf! :)

Both Smullyan and Feynmann were born in the immigrant Jewish community of Far Rockaway, NY, almost exactly one year apart. Smullyan was one year younger than Feynmann, but lived 30 years longer.

https://en.wikipedia.org/wiki/Far_Rockaway,_Queens#Notable_p...

It seems Smullyan moved to Manhattan and went to high school in the Bronx, so they would not have been at Far Rockaway High School together.

Now what does it say about me if I read this whole thing and didn't find it all that funny?
There's no accounting for taste. I'm ashamed to admit my father didn't find the works of Douglas Adams funny. I loved him anyway.

In any event, I don't think the piece is meant as a "ha ha" comedy. It's more of a contemplation or a work of philosophy.

Did it make you smile at any point?

Yes it did make me smile. My response was only half serious.

The moral I extracted was that free will is fake and humor is all that matters. If that's the case I'd have hoped he'd have thrown a few more jokes in ;)

:)
That you're not Knuth, but I suspect you knew that already.

If not, um... retroactive spoiler warning? Sorry.

Say, have you heard of that book, The Three Knuths of Ypsilanti?

I guess Raganwald's blog entry "To Grock a Mockinbird" is a reference to that book?

http://raganwald.com/2018/08/30/to-grok-a-mockingbird.html

Yes :)
I thoroughly enjoyed this story, and Wolfram's thorough detective work For those interested, Schönfinkels article (über die Bausteine der mathematischen Logik) is available online [1]. A good writeup about its significance can be found in the Stanford Encyclopedia of Philosophy [2]

[1] http://www.cip.ifi.lmu.de/~langeh/test/1924%20-%20Schoenfink...

[2]https://plato.stanford.edu/entries/logic-combinatory/#SchoEl...

For those who are getting confused about the S and K combinators, another definition is implied by the notion of a 'partial combinatory algebra' [1]. It is a set A together with a partial map A x A -> A, viewed as 'application'. You then require the S and K combinators to exist somewhere in A (they are non-unique) and you can prove things like the recursion theorem.

[1] https://ncatlab.org/nlab/show/partial+combinatory+algebra

I wish Wolfram used help of a Russian-speaking editor, but the material makes up for the small errors he makes.
There are now some updates with improvements to the Russian-language bits.

https://writings.stephenwolfram.com/2020/12/where-did-combin...

Why did Y Combinator the company choose their name?

https://en.wikipedia.org/wiki/Fixed-point_combinator#Y_combi...

Related: There exists programming language Unlambda https://en.wikipedia.org/wiki/Unlambda which is based on combinatory logic in which "Variables are unsupported".

From a programmer's viewpoint it seems like a daunting task: Write all your programs without using variables. Can that work in practice?

Funnily, I do indeed know the name from a footnote that mentions he is the original inventor of currying.
What really amazes me is the favourable response of people. In hindsight, these are folks that either helped wolfram, interacted w him in the past or worked for him.

On yet another tombstone, I wonder how many people internally within the wolfram dark net, touched this up Before it got published.

The so called gem notes, are more often than not, not his own discoveries aka universality of rule 110. Truth be told I hear many other voices in this piece that have not been mentioned or acknowledged.