Funny, I was just thinking about that very thing when I noticed your post. Problem is, I'm a little "ashamed" to release the code right now because I'm in the middle of some major revisions:
1. Use DeBruin notation for referring to bound lambda values positionally, instead of my current approach of actually using the lambda symbol in an associative list. (Don't laugh, it works.)
2. Change the C implementation so it just uses an array of longs for the whole working space, with integer offsets, instead of node structs with pointers. At that point the whole implementation will be "nothin' but int", and avoid malloc altogether.
3. Change both the Perl and C implementations so that I supply external function definitions as function pointers. Right now when a high-order function is reduced to a normal form (lambda form), I actually look at the name of the external function being called and do a big if-else on that. (Don't laugh, it works.) This will no longer be an option when I go with DeBruin notation. Besides, the function pointer approach will be a lot more solid, serious, and extensible.
Thanks for expressing an interest. When I'm a little more proud of my code and have something available for download, I'll announce it on HN.
This is probably a stupid question. If you have Fexl written in Perl now, what are the chances of just throwing a sand-boxed interpreter online for people to play with? I'm thinking just a form with an input field and a submit button which runs the script and displays any printed values.
It's a brilliant question, and it's now on my TODO list. I'll aim to get this done by the end of August.
You're right about the sandboxing too -- the Fexl interpreter runs with two limits: the maximum number of cycles, and the maximum amount of memory. If it reaches either of those limits, it halts. So it should be perfectly safe to run it on the public web.
Update: It might not be ready by the end of August. I just returned from a week-long trip, and I'm now busy rewriting the whole Fexl interpreter in terms of pure combinatorics (see http://fexl.com/#combinatorics ).
The interpreter understands only the two fundamental combinators S and C. Although it is possible to define high-order combinators in the interpreter using function pointers, I am limiting myself to S and C to minimize the code size and illustrate the principle that all computable functions can be defined as applications of only S and C.
So, for example, even the Y combinator itself is defined in terms of S and C as:
\I = (S C C)
\Q = (S (S (C S) C) (C (S I I)))
\Y = (S Q Q)
I'm now doing some bootstrapping, eliminating code formerly written in Perl. I'm writing the Fexl parser in Fexl, and manually converting it to S and C. I can then eliminate the parser written in Perl.
Thanks, that's very good to know. I've made excellent progress on the Perl implementation, doing everything in terms of combinatorics. Much thought has gone into this.
I think before I put up the sandboxed interpreter at fexl.com, I'll just release a tar.gz file of the Perl code. That way I can release something even earlier. Also, it will give me more time to take a look at a case that causes a segmentation fault. One of my test cases is a "metastasis" function which runs forever using an increasing amount of memory. That function is:
Y S f g
Where Y is the fixpoint function, S is the fusion function, and f and g are any functions.
I can limit the evaluation to let's say 3,000,000 cycles, so it does terminate just fine. But at that point I've built a deeply nested data structure in Perl, and when I drop it and the reference count goes to zero, Perl begins freeing that structure recursively. Well, that causes a stack overflow and hence a segmentation fault.
My workaround was to store a pointer to the big structure in a global variable. That way the structure stays live until the program terminates, at which point Perl simply abandons the structure instead of trying to free it recursively.
Of course, I easily avoid this problem in the C implementation because I free structures lazily.
Another workaround is to write a destroy routine which frees a structure non-recursively. I did try that, but the problem is that it goes all the way down, even destroying my standard structures for the Y and I combinators. So it's only something that could be used once during a program run.
In any case I'd rather not put up the sandboxed interpreter until I can get a good handle on the metastasis case. I just need to work around this one issue with Perl.
I think the code will illustrate how easy it is to augment an existing programming language such as Perl or C with a high-order functional interpreter. I'll keep you posted.
Actually there's a much simpler metastasis function, perhaps the most evil function of all, namely, the result of applying the Y combinator to the Y combinator itself:
Y Y
If you expand this function two steps, you'll notice that it equals:
Y Y (Y (Y Y))
And so you end up with an infinitely left-recursive thing from hell.
Fortunately my implementation of Fexl does not use built-in recursion, which would quickly throw up a segmentation fault due to stack overflow. Fexl does its own stack with a chain of nodes. Also, Fexl runs inside a completely bounded "arena", which is a fixed-size array of machine integers. All node "pointers" are actually just integer offsets into this arena. This makes it possible to increase the size of the arena by allocating a brand new one and doing a mass copy from the old to the new. But I have not bothered with this yet, because I figure that deciding on your upper bound on memory once up front is just fine for now. Note also that my approach allows for nested arenas, so for example you could have a Fexl function which allocates an arena and runs a Fexl interpreter inside there. The whole thing is completely reflective in the most profound way you could imagine, so the sky's the limit. The use of pure functions enables abstractions which are not leaky.
This leads me to my main point. I've decided to stick with the C implementation because it gives me the complete control which I need. When I want to call Fexl from Perl, I will simply fork a separate Fexl process and have Perl pump a Fexl function into its standard input. That initial function can then read any remaining input. You can do any sort of bootstrapping this way, for example the initial function may expect another Fexl function at the head of the remaining input. Any file names on the Fexl command line would simply be treated as if their contents had appeared on standard input first.
The Perl code then reads the standard output of the Fexl process, and that's its answer. On the web, it could connect the output directly to the client socket and be done with it.
One nice thing about this approach is that it's utterly safe. Your code could read the evil (Y Y) function straight off a socket from a known black hat and evaluate it with confidence, knowing that it would simply reach your upper bound on memory -- or cycles, whichever comes first -- and halt in the most ordinary way.
Another nice thing is that I don't have support an endless list of implementations in other languages, or even bindings for other languages. There is one authoritative piece of code written in C and that's that.
If you feel that this approach is too slow, then you can simply move more of your logic from Perl into the Fexl program itself. If you take this principle to its extreme, you would end up with a Perl process which does nothing except feed a program into Fexl -- at which point you would drop Perl altogether. Note that Fexl programs can be extremely compact, and you could even read auxiliary functions from files on demand.
Ultimately the entire API at https://loom.cc will be based on Fexl. Instead of doing a bunch of grungy back-and-forth API calls and doing all your branching and looping on the client side, you can just ship an entire Fexl program to the Loom server and have it run there, delivering precisely the answer you want in the format you want. You could even send back a single number such as "42" if that's all you need.
To avoid shipping the same program to Loom repeatedly, you could store the program in an Archive slot once. Then whenever you want to run that program, you only have to send the archive ID.
It sounds like you're having a lot of fun with it.
I think a C implementation would be better anyway. C is the common denominator between almost every language. Have you already done the foreign function interface to C?
When do you think we might see either source code or a live web demo to play with?
For a web demo, couldn't you just add a quick CGI interface to your C program?
1. Use DeBruin notation for referring to bound lambda values positionally, instead of my current approach of actually using the lambda symbol in an associative list. (Don't laugh, it works.)
2. Change the C implementation so it just uses an array of longs for the whole working space, with integer offsets, instead of node structs with pointers. At that point the whole implementation will be "nothin' but int", and avoid malloc altogether.
3. Change both the Perl and C implementations so that I supply external function definitions as function pointers. Right now when a high-order function is reduced to a normal form (lambda form), I actually look at the name of the external function being called and do a big if-else on that. (Don't laugh, it works.) This will no longer be an option when I go with DeBruin notation. Besides, the function pointer approach will be a lot more solid, serious, and extensible.
Thanks for expressing an interest. When I'm a little more proud of my code and have something available for download, I'll announce it on HN.