Hacker News new | ask | show | jobs
by creepycrawler 1274 days ago
When baby Lispers are born, the first thing they are taught is a dichotomy: the universe is split into conses and atoms. Cons cells are simple and composed of two components, which are called the car and cdr. There are functions to retrieve what's stored in these components, which are called CAR and CDR. Atoms may be as complex as you like. They include objects like numbers, characters, strings, arrays, symbols, etc. NIL is not a cons cell, but a symbol. We can represent lists by chaining cons cells. We may start with an empty list, and by convention this is the symbol NIL. If we also choose to designate NIL as the false value, and everything else as true values, then it is useful to modify CAR and CDR so that they take not only conses, but also the symbol NIL, and return NIL, which is false, and the empty list. We then say that CAR and CDR take a list, i.e. an object of type (or cons null). We do not say that NIL is a cons.

Your [UPDATE] shows that you still don't understand what is meant by "dotted list". I already gave a definition of one, but did not give an example. An example of a dotted list is (a b . c) i.e. the last cons has a cdr that is (i) an atom (otherwise, it wouldn't have been the last cons) and (ii) not NIL (which is the conventional empty list designation).

2 comments

(a b . c), as an object is an improper list.

The printed notation is a dotted list.

By an informal metonymy, the internal object is called a dotted list. It's only an informal usage among Lisp coders. The correct terminology is "improper" for the object and "dotted" for the spelling.

Note that (a b c . nil) is dotted, but it's the same as (a b c) which is proper.

Unfortunately, the Common Lisp specification encodes the "dotted list" informality in the Glossary. Common Lisp does things like that.

Furthermore Common Lisp uses "dotted list" as an essential term denoting a subset of of "improper list". An "improper list" is circular or not terminated by nil. A dotted list is only the latter.

That's in spite of the fact that a circular list will print with the dot notation: #1=(a b c . #1#)! Circular lists are dotted when completely printed, under the circle notation.

Another problem is that the append function and others support the idea of a non-nil atom being an empty dotted list. For instance (append '(a b c) 'd) will work and produce (a b c . d). Yet the "dotted list" definition excludes such an atom. If we go by the presence of a dot, that is correct, but then we know from circular lists not being dotted that that isn't the criterion.

Right, I just wrote about it in my reply to "lisper". The name "dotted list" may not have been the best choice, but that's not an outlier in human history.
You don't need to be quite so condescending. I understand all this perfectly well. But the concept of a "dotted list" has to do with how a list is serialized, not how it is represented in the internally, which is what I am talking about.

NIL is weird in CL and different implementations handle this weirdness in different ways. One way to handle it is to represent NIL as a symbol whose name is "NIL" and write CAR and CDR to recognize when they see this symbol and return it. Another way to handle it is to represent NIL as a cons cell whose CAR and CDR point to itself and write SYMBOLP and SYMBOL-NAME to recognize when they see this privileged cons cell and return T and "NIL" respectively. (There are a lot of other special cases -- this is not an exhaustive list.)

However you slice it, the concept of a dotted list is a non-sequitur because that has NOTHING to do with how NIL is implemented internally, which what I am talking about.

The name "dotted list" comes from how the list is serialized, yes, but not the concept. You can write a function dotted-list-p that takes an object and returns the value of (and (consp object) (cdr (last object))). Indeed, it is not related to the NIL issue, and I made no such assumption. You can read an equivalent definition that uses different words in the CLHS glossary.

You admit to talking about "how NIL is implemented internally", hence my original remark that you are conflating implementation tricks with language semantics. From language semantics point of view, NIL is not weird, it is an ordinary symbol, like NIK or NIM. Lisp tradition assigns it certain roles, like representing the empty list, representing the false value, representing the empty type, and also has conveniences like having NIL evaluate to itself or having CAR and CDR take lists instead of conses.

I didn't mean to be condescending. I know you are not a baby Lisper. The matter at hand is very basic, and I understand the desire to present a more sophisticated take, but believe it leads to (and reflects) a distorted ontological view of Lisp if taken seriously. NIL is not weird, it is a simple symbol. We just assigned it a few roles and made a few conveniences. Why did we pick it for those roles? Arbitrary choice. Why did we choose to extend the domain of CAR and CDR? Practical choice. Let the puritans complain and build their own ivory tower languages. Lisp is pragmatic. Could an implementor choose to represent NIL as a closet cons for simple CAR and CDR implementations and special case everything else? Sure. But this has nothing to do with the ontology of Lisp.

> having CAR and CDR take lists instead of conses.

(car nil) and (cdr nil) safely returning nil was introduced in InterLisp, according to Gabriel's HOPL palper. At some big Lisp summit in the early 1970's, InterLisp decided to adopt MacLisp's readtables, and MacLisp adopted (car nil) -> nil.

Why the empty list is a symbol is natural: math uses symbols to refer to such things. For instance the empty set is notated both {} and ∅: empty braces or the special null set symbol.

I mean, why have symbols refer to things in a language that is consciously oriented toward symbolic processing, whose initial creators were people with math backgrounds? It's pretty much a forgone no-brainer.

Having a symbol DENOTE the empty list and having a symbol be THE SAME OBJECT as the empty list are two very different things. The former is perfectly reasonable. The latter causes no end of difficulties.
The former is somewhat reasonable for having pi denote 3.14159... and such.

It brings in its own difficulties. If nil is a variable/constant which evaluates to some nil object, then to talk about nil itself, we have to quote it.

The object which it denotes doesn't print as nil; it has its own printed rep like () and we will end up seeing that printed rep and using it. So then we have two nil representations to deal with: the nil variable which we can use in evaluated contexts, and the literal nil like () that we use elsewhere.

If () isn't self-evaluating (like the criminally stupid design in Scheme), we have to quote it: '().

If we have additional semantic roles for () like it being Boolean false and the bottom of the type spindle, those uses are not nicely served by the () notation, from an esthetic point of view.

The list terminator is de facto a symbol because it's exploited for its identity: we care whether the cdr of a cons is or is not nil, and that's all. That's a symbolic behavior; we might as well complete things so that the symbol functions work with nil; it can have a property list, name and so on.

> to talk about nil itself, we have to quote it.

So? That's no different than if you want to talk about 'pi rather than pi a.k.a. 3.14159...

> So then we have two nil representations to deal with

No different than "pi" and "3.14159".

> If () isn't self-evaluating (like the criminally stupid design in Scheme), we have to quote it: '().

I agree, () should be self-evaluating just like numbers and vectors. The behavior of () should be analogous to the behavior of 0 or 0.0 or #() or "".

> If we have additional semantic roles for () like it being Boolean false

And why would you want to do a stupid thing like that?

> The list terminator is de facto a symbol because it's exploited for its identity

First, the list terminator need not be the same thing as the empty list. The list terminator is an implementation thing. The empty list is a language-semantics thing. These need not be the same.

Second, neither of these need to be unique. There can be multiple list terminators, and there can be multiple empty lists, just as there can be multiple empty vectors and multiple empty strings and even multiple instances of the same number (which actually happens with bignums).

> we care whether the cdr of a cons is or is not nil

No, what we care about -- or at least what we should care about -- is whether the CDR of a cons is an (n.b. not the) empty list. (eq #() #()) need not be true (in fact, generally isn't). Why should (eq () ()) be any different?

And BTW under no circumstances should taking the CAR or CDR of an empty list do anything other than signal an error.

> The name "dotted list" comes from how the list is serialized, yes, but not the concept.

Yes, it does. Dotted lists are entirely related to serialization. The only reason they matter is that, by convention, (a b ... z) is a shorthand notation for (a . (b . (... (z . nil)) ...) and (a b ... z . anything-but-nil) is a shorthand notation for (a . (b . (... (z . anything-but-nil)) ...)

> You can write a function dotted-list-p

Of course you can. So what? You can write a predicate for any (computable) property. I can write a function list-ends-in-3-p. That doesn't mean that lists that end in 3 matter.

The reason "dotted lists" are called dotted lists is entirely because of their serialization behavior: dotted lists have a dot in their serialization and non-dotted-lists don't. And the reason that the I/O behavior is what it is is that it turns out that punning cons cells as linked lists is a useful hack (or at least it was in 1958). But it is only a hack. There is no reason that the data structure used to represent pairs has to be the same data structure that is used to represent linked lists. Indeed, one could argue that this punning is actually a serious mistake. There should be pairs, with CAR and CDR fields which can take on any value, and there should be (linked) lists, with a FIRST field that can take on any value, and a REST field that is restricted to only contain another linked list (including a privileged empty list object). In such a design, the whole concept of "dotted list" would be non-sensical. You could still write (a . nil) or (a . ()) but that would no longer be the same object as (a), the former being a pair and the latter being a list.

So you see the concept of dotted list is rooted entirely in an I/O hack that John McCarthy invented back in 1958 so he could build linked lists out of punned pairs rather than make them a separate data type.

> NIL is not weird, it is an ordinary symbol, like NIK or NIM

No, NIL is not an "ordinary symbol". NIL is both a symbol and a list. NIL answers T to LISTP. No other symbol does that. You can call CAR, CDR, LENGTH, ASSOC etc. on NIL and not get an error. You cannot do that with any other symbol.

NIL is the only symbol to which you cannot give a function binding. In some implementations, attempting to do so triggers its own error message:

    Clozure Common Lisp Version 1.12.1 (v1.12.1-10-gca107b94) DarwinX8664
    ? (defun nil () t)
    > Error: Using NIL as a function name is silly.
Also, NIL answers T to LISTP but not to CONSP. So you can call CAR and CDR on it but not RPLACA and RPLACD. And, at the risk of stating the obvious, NIL answers T to NULL.

NIL is the only object with these properties. That is the very definition of "weird". (And note that I have made no reference to implementation details here.)

> The matter at hand is very basic

No, it isn't, or we wouldn't be arguing about it.

The point was that dotted-list-p doesn't have anything to do with the "serialized form" (printed representation) of the list. Dotted lists are important because they are one of the two types of improper list, the other type being circular lists. Many Lisp functions work on proper lists.

Of course NIL is an ordinary symbol. It has roles, the same way other symbols have roles. Are you saying the symbol &BODY is not an ordinary symbol because it is a lambda list keyword? Are you saying the symbol T is not an ordinary symbol because type boolean is (member t nil)? Are you saying symbol * is not an ordinary symbol because it is a special variable bound by the REPL, it is used in declaration syntax, and also is the name of the product function? Are you saying MUMBLE:VAPORIZE is not an ordinary symbol because it names the most dangerous operation in the mumble library? They are all ordinary symbols, sometimes with unique roles. Because NIL was chosen to satisfy the role of the empty list, is it any wonder that type list is (or cons null), type null is (eql nil), LENGTH and ASSOC and other list functions can deal with it etc.? Of course not. That doesn't make NIL qua symbol any weirder than any other symbol that has particular roles in a given context.

There are many symbols in Common Lisp that you cannot DEFUN. See section 11.1.2.1.2 in the hyperspec. Since NIL names a constant variable but not a standardized function, macro, or special operator, you can bind it to a function lexically using FLET or LABELS.

Once again, NIL is not a cons, it is an atomic symbol that also represents the empty list, so it's no wonder that LISTP returns true and CONSP returns false. CAR and CDR take lists (again see their entries in the CLHS) so it's no wonder that they can work with it. RPLACA and RPLACD take conses only so it's no wonder that they don't. At the risk of stating the obvious (again and again) the type null is (eql nil) so it's no wonder that NULL returns T for it. That operators treat a symbol specially does not make a symbol weird. It merely means that it serves a particular role. Any symbol can take any number of roles within a program. That doesn't make the symbol weird. I will repeat once more, because I've a feeling you didn't get this. It doesn't make a symbol weird.

It is a basic matter, definitely. I learned all this stuff about conses, atoms, symbols, lists, etc. very early on, as baby Lisper. I don't know why someone who spent 35 years programming Lisp should have trouble understanding all this, and has to keep bringing up new irrelevant/incorrect claims instead of just opening a beginner's Lisp book and re-reading the first chapter or two, where they talk about all this stuff.

> dotted-list-p doesn't have anything to do with the "serialized form" (printed representation) of the list

It does: the details of the serialization are the reason that the concept of "dotted list" even exists. At the risk of belaboring the obvious, dotted lists are called "dotted lists" because there is a syntactically-significant dot in their serialization.

> There are many symbols in Common Lisp that you cannot DEFUN. See section 11.1.2.1.2 in the hyperspec.

It is not that you cannot defun them, it is that this is undefined behavior. At least one implementation (CCL) lets you defun just about everything except NIL:

    ? (defun &body () t)
    &BODY
    ? (&body)
    T
    ? (defun t () nil)
    T
    ? (t)
    NIL
But, as noted earlier...

    ? (defun nil () t)
    > Error: Using NIL as a function name is silly.
> Since NIL names a constant variable but not a standardized function, macro, or special operator, you can bind it to a function lexically using FLET or LABELS.

NIL's lack of weirdness in this one regard actually seems pretty weird to me. IMHO it would actually makes sense to prohibit defining a function whose name is the same object that designates an empty list, just as it makes sense to prohibit defining functions whose names are (say) numbers.

> That operators treat a symbol specially does not make a symbol weird.

That all turns on how you define "weird". One dictionary definition of "weird" is "strikingly odd or unusual, especially in an unsettling way; strange". The fact that the CAR and CDR of NIL are both NIL seems weird to me. The fact that NIL is the canonical boolean false is NIL rather than F seems weird to me. (Actually, the canonical booleans, if they were going to be symbols at all, should have been :T and :F or :TRUE and :FALSE. But that's a different discussion.)

> Are you saying the symbol &BODY is not an ordinary symbol because it is a lambda list keyword?

Yes. Lambda-list keywords are weird. They behave differently from X and Y and Z and BAZ and BAR and BING and BIFF and BOFF and ORDINARY-SYMBOL and even WEIRD-SYMBOL. The same is true for symbols interned in the keyword package. All of these things are weird.