Hacker News
by ntoshev 5719 days ago
There is a language and there is an ecosystem surrounding it (libraries, community, etc.). Ignoring the ecosystem, the question "to lisp or not to lisp" pretty much boils down to "syntax or macros": if you want macros, you go with a lisp; if you want syntax, you go with Python or another modern language.

I used to think macros matter more than syntax, because you can freely define your own micro-languages. I didn't really practice it, because of the practical limitations of the available lisps [1]. Now I think the opposite: syntax matters more. Syntax helps you parse the code visually, so you use lower-level parts of your cortex to understand the code [2]. You can build arbitrary DSLs in lisps, but they all have no syntax, so they are of limited cognitive help. I think the real win is a modern language whose syntax is malleable enough to support the cognitive apparatus of the programmer in most cases, or at least in most cases that matter. For example, an obvious DSL is mathematical notation - Python / Ruby handle it well enough with operator overloading, while Lisp actually does worse because of its prefix notation.
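To make the operator overloading point concrete, here is a minimal Python sketch. `Vec2` is a made-up class for illustration, not something from a real library; the Lisp line in the comment assumes hypothetical `v+` and `v*` functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vec2:
    """Hypothetical 2D vector used only to illustrate infix notation."""
    x: float
    y: float

    def __add__(self, other: "Vec2") -> "Vec2":
        # Enables `a + b` for two vectors.
        return Vec2(self.x + other.x, self.y + other.y)

    def __mul__(self, k: float) -> "Vec2":
        # Enables `v * k` for a vector and a scalar.
        return Vec2(self.x * k, self.y * k)

    # Enables `k * v` as well, so the scalar can be written first.
    __rmul__ = __mul__


a = Vec2(1.0, 2.0)
b = Vec2(3.0, 4.0)

# Reads like the math: (a + b) / 2.
# A prefix version might look like (v* 0.5 (v+ a b)), assuming such functions.
midpoint = 0.5 * (a + b)
```

Each operator sits next to both of its operands here, which is the property the infix argument relies on.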

It is important to understand that you can approximate the bottom-up style of building abstractions with libraries (instead of DSLs), parameterizing the proper things, with minimal syntax noise. The remaining difference between macros and using higher level functions is mostly in run time optimization.
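As a rough sketch of what I mean, here is a control-flow abstraction written as a plain higher-order function in Python, the kind of thing that might be a `with-retries` macro in a lisp. The names `with_retries` and `flaky` are invented for this example.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def with_retries(attempts: int, action: Callable[[], T]) -> T:
    """Call `action` until it succeeds or `attempts` calls have failed."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Optional[Exception] = None
    for _ in range(attempts):
        try:
            return action()
        except Exception as exc:  # the "body" is just a function call
            last_error = exc
    assert last_error is not None
    raise last_error


calls = {"n": 0}


def flaky() -> str:
    # Fails on the first two calls, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"


result = with_retries(5, flaky)
```

The body is passed as a closure and invoked at run time, whereas a macro would expand the loop around the body at compile time. That is the remaining difference I am referring to.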

I guess seasoned lispers learn to "see through" all the brackets and engage the lower-level part of the brain in parsing lisp code. Ironically, something similar happens to Java developers - after enough hours looking at the code they start to ignore the ugly try/catch clauses that can't be properly abstracted away because of language limitations.

[1] with the exception of one big project in Common Lisp, but I did only a little programming in it, and this was before I fully appreciated macros - but the guy before me used them extensively to build two layers of domain specific languages

[2] L Peter Deutsch talks about this in Coders at Work and this is probably more valuable than what I have to say:

Deutsch: I can tell you why I don’t want to work with Lisp syntax anymore. There are two reasons. Number one, and I alluded to this earlier, is that the older I’ve gotten, the more important it is to me that the density of information per square inch in front of my face is high. The density of information per square inch in infix languages is higher than in Lisp.

Seibel: But almost all languages are, in fact, prefix, except for a small handful of arithmetic operators.

Deutsch: That’s not actually true. In Python, for example, it’s not true for list, tuple, and dictionary construction. That’s done with bracketing. String formatting is done infix.

Seibel: As it is in Common Lisp with FORMAT.

Deutsch: OK, right. But the things that aren’t done infix; the common ones, being loops and conditionals, are not prefix. They’re done by alternating keywords and what it is they apply to. In that respect they are actually more verbose than Lisp. But that brings me to the other half, the other reason why I like Python syntax better, which is that Lisp is lexically pretty monotonous.

Seibel: I think Larry Wall described it as a bowl of oatmeal with fingernail clippings in it.

Deutsch: Well, my description of Perl is something that looks like it came out of the wrong end of a dog. I think Larry Wall has a lot of nerve talking about language design—Perl is an abomination as a language. But let’s not go there. If you look at a piece of Lisp code, in order to extract its meaning there are two things that you have to do that you don’t have to do in a language like Python.

First you have to filter out all those damn parentheses. It’s not intellectual work but your brain does understanding at multiple levels and I think the first thing it does is symbol recognition. So it’s going to recognize all those parenthesis symbols and then you have to filter them out at a higher level.

So you’re making the brain symbol-recognition mechanism do extra work. These days it may be that the arithmetic functions in Lisp are actually spelled with their common names, I mean, you write plus sign and multiply sign and so forth.

Seibel: Yes.

Deutsch: Alright, so the second thing I was going to say you have to do, you don’t actually have to do anymore, which is understanding those things using token recognition rather than symbol recognition, which also happens at a higher level in your brain. Then there’s a third thing, which may seem like a small thing but I don’t think it is. Which is that in an infix world, every operator is next to both of its operands. In a prefix world it isn’t. You have to do more work to see the other operand. You know, these all sound like small things. But to me the biggest one is the density of information per square inch.

Seibel: But the fact that Lisp’s basic syntax, the lexical syntax, is pretty close to the abstract syntax tree of the program does permit the language to support macros. And macros allow you to create syntactic abstraction, which is the best way to compress what you’re looking at.

Deutsch: Yes, it is.

Seibel: In my Lisp book I wrote a chapter about parsing binary files, using ID3 tags in MP3 files as an example. And the nice thing about that is you can use this style of programming where you take the specification—in this case the ID3 spec—put parentheses around it, and then make that be the code you want.

Deutsch: Right.

Seibel: So my description of how to parse an ID3 header is essentially exactly as many tokens as the specification for an ID3 header.

Deutsch: Well, the interesting thing is I did almost exactly the same thing in Python. I had a situation where I had to parse really quite a complex file format. It was one of the more complex music file formats. So in Python I wrote a set of classes that provided both parsing and pretty printing. The correspondence between the class construction and the method name is all done in a common superclass. So this is all done object-oriented; you don’t need a macro facility. It doesn’t look quite as nice as some other way you might do it, but what you get is something that is approximately as readable as the corresponding Lisp macros. There are some things that you can do in a cleaner and more general way in Lisp. I don’t disagree with that. If you look at the code for Ghostscript, Ghostscript is all written in C. But it’s C augmented with hundreds of preprocessor macros. So in effect, in order to write code that’s going to become part of Ghostscript, you have to learn not only C, but you have to learn what amounts to an extended language. So you can do things like that in C; you do them when you have to. It happens in every language. In Python I have my own what amount to little extensions to Python. They’re not syntactic extensions; they’re classes, they’re mixins—many of them are mixins that augment what most people think of as the semantics of the language. You get one set of facilities for doing that in Python, you get a different set in Lisp. Some people like one better, some people like the other better.
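To give a feel for the class-based approach Deutsch describes, here is a small Python sketch of my own. It is not his code. `Record` and `Id3Header` are hypothetical names, and the layout is a simplified ID3v2-style header in which the 4-byte size field is kept as raw bytes rather than decoded.

```python
import struct


class Record:
    """Common superclass: subclasses only declare FIELDS."""

    FIELDS = ()  # sequence of (attribute name, struct format code)

    def __init__(self, **values):
        for name, _ in self.FIELDS:
            setattr(self, name, values[name])

    @classmethod
    def parse(cls, data: bytes) -> "Record":
        # Build a big-endian layout from the declared field codes.
        layout = struct.Struct(">" + "".join(code for _, code in cls.FIELDS))
        raw = layout.unpack(data[:layout.size])
        return cls(**{name: value for (name, _), value in zip(cls.FIELDS, raw)})

    def pretty(self) -> str:
        # Pretty printing is derived from the same declaration as parsing.
        body = ", ".join(f"{name}={getattr(self, name)!r}" for name, _ in self.FIELDS)
        return f"{type(self).__name__}({body})"


class Id3Header(Record):
    # The declaration reads roughly like the spec: name, then width.
    FIELDS = (
        ("tag", "3s"),
        ("major", "B"),
        ("revision", "B"),
        ("flags", "B"),
        ("size", "4s"),  # left undecoded in this sketch
    )


header = Id3Header.parse(b"ID3\x03\x00\x00\x00\x00\x02\x01")
summary = header.pretty()
```

The subclass body plays the role a macro form would play in Lisp: it is data that the superclass interprets, with no syntactic extension involved.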

1 comments

Lisp has a lot of syntax. It is just a bit different and it looks different externally.

Lisp has a 2-stage syntax.

The first stage is the syntax of s-expressions, which is surprisingly complex. S-Expressions provide a textual syntax for data: symbols, lists, pairs, strings, various number formats, arrays, characters, pathnames, ...

The first stage is implemented by the 'reader' and can be reprogrammed by an ancient API to the reader via read tables.

The second stage is the syntax of the Lisp programming language. This is defined on top of s-expressions and is really a syntax over data structures (not text). This Lisp syntax deals with: data items, function calls, special forms (thirty something) and macro forms.

This syntax stage is implemented as part of the interpreter/compiler (EVAL, COMPILE, COMPILE-FILE) and can be extended by writing macros, symbol macros and compiler macros. In earlier dialects it could also be extended by writing so-called FEXPRs, functions which get called with unevaluated source code (-> data in Lisp).

So, we get a lot of complex syntax due to special forms and macros. It just looks a bit different, since the data syntax is always underneath it (unless one uses a different reader).

For example a function definition would be:

    (defun foo (a b) (+ (sin a) (sin b)))

The syntax for that is:

    defun function-name lambda-list [[declaration* | documentation]] form*

With more complex syntax for 'function-name', 'lambda-list' and 'declaration'.

Lambda-list has this syntax:

    lambda-list::= (var* 
                    [&optional {var | (var [init-form [supplied-p-parameter]])}*] 
                    [&rest var] 
                    [&key {var | ({var | (keyword-name var)}
                      [init-form [supplied-p-parameter]])}*
                      [&allow-other-keys]] 
                    [&aux {var | (var [init-form])}*])

Not every valid Lisp program has an external representation as an s-expression - because it can be constructed internally and can contain objects which can't be read back.

Not every s-expression is a valid Lisp program. Actually most s-expressions are not valid Lisp programs.

For example

    (defun foo bar)

is not a valid Lisp program. It violates the syntax above.

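The two stages can be mimicked in a few lines of Python. This is only a toy sketch: `read` handles just symbols and parentheses, and `valid_defun` checks only a small part of the DEFUN syntax (it ignores names like `(setf foo)`, declarations and the inner structure of the lambda-list). Both function names are made up for the example.

```python
def read(text: str):
    """Stage 1: text -> data (nested lists of symbol names)."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def parse(pos: int):
        tok = tokens[pos]
        if tok == "(":
            items, pos = [], pos + 1
            while tokens[pos] != ")":
                item, pos = parse(pos)
                items.append(item)
            return items, pos + 1  # skip the closing parenthesis
        return tok, pos + 1

    form, _ = parse(0)
    return form


def valid_defun(form) -> bool:
    """Stage 2: check data against `defun function-name lambda-list form*`."""
    return (
        isinstance(form, list)
        and len(form) >= 3
        and form[0] == "defun"
        and isinstance(form[1], str)   # function-name: a symbol here
        and isinstance(form[2], list)  # lambda-list must be a list
    )


good = read("(defun foo (a b) (+ (sin a) (sin b)))")
bad = read("(defun foo bar)")
```

Both strings pass stage 1 and come back as data. Only the second stage rejects `(defun foo bar)`, because `bar` sits where a lambda-list is required.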
"The first stage is implemented by the 'reader' and can be reprogrammed by an ancient API to the reader via read tables."

Readtables aren't any more ancient than the rest of ANSI INCITS 226-1994 (R2004) (the language standard formerly known as X3.226-1994 (R1999)), but the interface to them is very low-level and non-modular.

The current readtable is specified by a dynamic variable, so the readtable facility can be made modular with a nicer interface, in a portable manner. This is exactly what the library Named-Readtables does: http://common-lisp.net/project/named-readtables/

Now realize the significance of this: Common Lisp is the only language allowing total unrestricted syntactic extension and *modification* in a modular and (somewhat) composable way.

I've been using named-readtables for the past month, and between it and Quicklisp, I haven't been this excited about programming in CL since I started (which is 8 years ago, not that long, but I'm not a total noob either).

Read tables existed before CL. CLOS, for example, did not; it was developed for ANSI CL (based on experience with LOOPS and Flavors). See, for example, READTABLE in Maclisp:

http://maclisp.info/pitmanual/io.html#16.2.7

Also note that I wrote that the API is ancient. It is. It is old and could be easier to use.

'Named readtables' are related to 'syntaxes' on the Lisp Machine. For example source files have a syntax attribute in the header, which switches between the various Lisp dialects (or other languages), including using different readers. This is for example used by the file compiler and Zmacs.

Cool, I didn't know readtables were in Maclisp, or about LM syntaxes.

"It is old and could be easier to use."

Aside from something like named-readtables, how would you design the lowest-level interface to readtables? Or you wouldn't do that, and just specify something like named-readtables to be the interface? I'm curious because this could be something for http://www.cliki.net/Proposed%20Extensions%20To%20ANSI

I haven't thought about it much, but the character level interface is very primitive. Second, what about things like symbols, numbers, etc.? There is no sane way to specify number syntax or symbol syntax. That might be useful. Currently the reader provides an interface on a character level, but not on the level of s-expression components.

What I think you're saying is that the reader should be customizable in terms of some DSL for a grammar. It would be nice, but I'm not sure how composable it would be (in the general case, I think it would come down to having the behavior of READ dependent upon a black-box current-parser procedure).

The nice thing about readtables is that it exposes what in essence are transition hooks for each character, so you really don't need to care about the grammar of stuff you're not interested in parsing.
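Here is a toy Python model of that idea, just to show the shape of it. It is not the CL readtable API (there you would use `set-macro-character`). The dict-based readtable, `list_reader`, `vector_reader` and the `[ ... ]` vector syntax are all invented for this sketch, and unbalanced input is not handled.

```python
def skip_ws(text: str, pos: int) -> int:
    while pos < len(text) and text[pos].isspace():
        pos += 1
    return pos


def read(text: str, pos: int, readtable: dict):
    """Read one object; returns (object, next position)."""
    pos = skip_ws(text, pos)
    hook = readtable.get(text[pos])
    if hook is not None:
        # Transition hook: the character decides who parses what follows.
        return hook(text, pos + 1, readtable)
    # Default: a token runs until whitespace or any character with a hook.
    end = pos
    while end < len(text) and not text[end].isspace() and text[end] not in readtable:
        end += 1
    return text[pos:end], end


def list_reader(close: str):
    def hook(text, pos, readtable):
        items = []
        while True:
            pos = skip_ws(text, pos)
            if text[pos] == close:
                return items, pos + 1
            item, pos = read(text, pos, readtable)
            items.append(item)
    return hook


def unexpected(text, pos, readtable):
    raise SyntaxError(f"unexpected {text[pos - 1]!r} at position {pos - 1}")


def vector_reader(text, pos, readtable):
    items, pos = list_reader("]")(text, pos, readtable)
    return ["vector", *items], pos


STANDARD = {"(": list_reader(")"), ")": unexpected}
# Extending the syntax means adding hooks; the list hook is reused untouched.
EXTENDED = {**STANDARD, "[": vector_reader, "]": unexpected}

plain, _ = read("(f [1 2] x)", 0, STANDARD)
extended, _ = read("(f [1 2] x)", 0, EXTENDED)
```

The vector hook never needed to know how lists or tokens are parsed. The flip side shows up too: token syntax lives in the default branch of `read`, which no hook can reach, so changing number or symbol syntax means replacing that part wholesale.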

OTOH like you said, extending the syntax for numbers or symbols becomes quite hairy. But even with a DSL grammar approach, you'd need to change major parts of the grammar (which means copying and modifying the normative grammar of CL syntax - not that different from grabbing a portable CL reader implementation today (http://twitter.com/vsedach/status/26484049015)).

The big downside is having to decide which class of grammars it will support, and how they will be represented.

You are right of course, but this kind of syntax doesn't help the programmer to cognitively parse the code, which is what I am talking about.

No, why not?

I can parse Lisp quite well.