Hacker News new | ask | show | jobs
by dfranke 2446 days ago
Reproduced from feedback that I gave pg on an earlier draft (omitting things he seems to have addressed):

When you say,

> But I also believe it will be possible to write efficient implementations based on Bel, by adding restrictions.

I'm having trouble picturing what such restrictions would look like. The difficulty here is that, although you speak of axioms, this is not really an axiomatic specification; it's an operational one, and you've provided primitives that permit a great deal of introspection into that operation. For example, you've defined closures as lists with a particular form, and from your definition of the basic operations on lists it follows that the programmer can introspect into them as such, even at runtime. You can't provide any implementation of closures more efficient than the one you've given without violating your spec, because doing so would change the result of calling car and cdr on closure objects. To change this would not be a mere matter of "adding restrictions"; it would be taking a sledgehammer to a substantial piece of your edifice and replacing it with something new. If closures were their own kind of object and had their own functions for introspection, then a restriction could be that those functions are unavailable at runtime and can be only be used from macros. But there's no sane way to restrict cdr.

A true axiomatic specification would deliberately leave such internals undefined. Closures aren't necessarily lists, they're just values that can be applied to other values and behave the same as any other closure that's equivalent up to alpha, beta, and eta conversion. Natural numbers aren't necessarily lists, they're just values that obey the Peano axioms. The axioms are silent on what happens if you try to take the cdr of one, so that's left to the implementation to pick something that can be implemented efficiently.

Another benefit of specifying things in this style is that you get much greater concision than any executable specification can possible give you, without any loss of rigor. Suppose you want to include matrix operations in your standard library. Instead of having to put an implementation of matrix inversion into your spec, you could just write that for all x,

    (or
     (not (is-square-matrix x))
     (singular x)
     (= (* x (inv x))
        (id-matrix (dim x))))
Which presuming you've already specified the constituent functions is every bit as rigorous as giving an implementation. And although you can't automate turning this into something executable (you can straightforwardly specify a halting oracle this way), you can automate turning this into an executable fuzz test that generates a bunch of random matrices and ensures that the specification holds.

If you do stick with an operational spec, it would help to actually give a formal small-step semantics, because without a running implementation to try, some of the prose concerning the primitives and special forms leaves your intent unclear. I'm specifically puzzling over the `where` form, because you haven't explained what you mean by what pair a value comes from or why that pair or its location within it should be unique. What should

   (where '#1(#1 . #1))
evaluate to? Without understanding this I don't really understand the macro system.
3 comments

This is similar to the feedback Dave Moon gave to PG's previous language, Arc, more than a decade ago. http://www.archub.org/arcsug.txt

Representing code as linked lists of conses and symbols does not lead to the fastest compilation speed. More generally, why should the language specification dictate the internal representation to be used by the compiler? That's just crazy! When S-expressions were invented in the 1950s the idea of separating interface from implementation was not yet understood. The representation used by macros (and by anyone else who wants to bypass the surface syntax) should be defined as just an interface, and the implementation underlying it should be up to the compiler. The interface includes constructing expressions, extracting parts of expressions, and testing expressions against patterns. The challenge is to keep the interface as simple as the interface of S-expressions; I think that is doable, for example you could have backquote that looks exactly as in Common Lisp, but returns an <expression> rather than a <cons>. Once the interface is separated from the implementation, the interface and implementation both become extensible, which solves the problem of adding annotations.

This paragraph contributed a lot to my understanding of what "separating interface from implementation" means. Basically your comment is spot on. Instead of an executable spec, there should be a spec that defines as much as users need, and leaves undefined as much as implementors need.

In Clojure, some data structures are implemented as "sequable", which means that they are not implemented as sequences, but they can be converted to sequences if needed. Functions for head and tail are also defined for these structures, for example by internally converting them to sequences first. That means that any function that works with sequences can work with these structures, too.

This seems to me like the proper way to have "separation of interface from implementation" and "everything is a list" at the same time. Yeah, everything is an (interface) list, but not necessarily an (implementation) list.

Python and Rust both have similar "duck"y features where you can Impl iteration or add the necessary __special__ methods to get iteration features.
An object type can implement interface types, as you say.

Object type, and identity, can also be decoupled from implementation type. Dynamically, statically, or with advice. V8 javascript numbers and arrays can have several different underlying implementations, which get swapped depending on how the thing is used (and with extensions, you can even introspect and dispatch on them).

Typing can also be fine grained. So things can narrowly describe what they provide, and algorithms what they require. "I require a readable sequence, with an element type that has a partial order, and with a cursor supporting depth-one backtracking".

Object identity can also be decoupled from type. Ruby's 'refinements' permits lexically scoped alternative dispatch, so an object can locally look like something else. That, and just a few other bits, might have made the Python 2 to 3 transition vastly less painful - 'from past import python2'.

We are regrettably far from having a single language which combines the many valuable things we've had experience with.

Re: operational semantics, you can in fact "cheat" and lots of languages do. It's just difficult. Luajit and other tracing JITs optimize away implementation details or use specialized datastructures in the cases where no code depends on introspection behind the curtain, with trace poisoning and falling back to "proper" evaluation in cases where it's needed.
Right - a Lisp list is conceptually a linked-list. How the compiler chooses to implement the list is up to it - it can use an array as long as it can meet the specification and give useful performance.
> I'm having trouble picturing what such restrictions would look like. [...] there's no sane way to restrict cdr.

I guess you could question its sanity, but the obvious way would be to say that any object (cons cell) produced by make-closure or (successfully) passed to apply is/becomes a "closure object", and car/cdr will either fail at runtime or deoptimise it into its 'official' representation.