Hacker News new | ask | show | jobs
by imdhmd 2436 days ago
I am unable to understand the difference between (a b c) and (a b . c)
5 comments

(a b c) is (a b c . nil). That is, (a . (b . (c . nil))). On the other hand, (a b . c) is (a . (b . c)). (Edit: rewrote post)
() is equivalent to an empty list and nil according to this notation, in which case both would be proper lists. But according to pg, in bel, (a b . c) is not a proper list.

Edit: I think I understand now, hmm.

Think of a proper list like a degenerate tree with all the values on the left and structure on the right. A dotted list puts the last value on the right, where nil would usually be.

Dotted lists are pretty rare. They mostly show up in association lists, where key/value pairs are stored as

  ((name . dave) (type . user))
and adding a pair to the front of the list shadows any other pairs with the same key.
While I think this is the correct interpretation, I do not think it's actually implied by the language specification. In fact, I think the notation (a b . c) is simply undefined -- there's no way to formally deduce what it means from what is defined.
> 2. When the second half of a pair is a list, you can omit the dot before it and the parentheses around it. So (a . (b ...)) can be written as (a b ...).

That defines (a b . c). Edit: Note that (b . c) is a list.

It doesn't. That definition applies only to expressions like (a b c) which do not contain the dot operator at all. (a b . c) is not of the form specified. Moreover, every object of the sort defined via (a b c ...) is a proper list, and (a b . c) is, as described in the specification, not a proper list, so it can't be covered by that definition. It's true that (b . c) is a list, but the form (b . c) does not occur in the expression (a b . c) so this isn't relevant. (You'd need the expression under analysis to instead be (a (b . c)).)
The rule I quoted states that (a . (b . c)) can be written as (a b . c).
That would only be true if (b . c) were a proper list. But it’s not. Strictly speaking, it’s not even a list if you go by the definition of list provided before your quoted rule. It’s later called a “dotted list” to distinguish it from a list, which must be nil-terminated.

Your position boils down to claiming that the rule you quoted is intended to cover lists as well as dotted lists. As stated, it only covers the former, which leaves (a b . c) undefined.

Lists are made of cons cells, and a cons cell is simply a two-element array. In a list, the first element of a cons cell contains data and the second element contains a pointer to the next cons cell in the list.

How then should we represent the last cons cell in a list? We have two choices for what goes in the second element of the last cons cell:

1. The last piece of data, or

2. Nothing; i.e. a special non-pointer pointer. In Lisp this is called nil.

If we choose option 1 it's harder for code that traverses the list to be sure it has reached the end. It's also harder to splice a new element (a new cons cell) onto the end of the list dynamically.

Option 1 is the "dotted" form of ending a list and option 2 is the "proper" form.

Option 2 is more common in Lisp. From a type theory point of view, Option 2 restricts the second element of a cons cell to contain either a pointer to a cons cell or nil. This makes reasoning about code, compiling code, and optimizing code easier.

"Also by convention, the cdr of the last cons cell in a list is nil. We call such a nil-terminated structure a proper list. In Emacs Lisp, the symbol nil is both a symbol and a list with no elements. For convenience, the symbol nil is considered to have nil as its cdr (and also as its car)."

https://www.gnu.org/software/emacs/manual/html_node/elisp/Co...

I read the guide from the beginning and (a b . c) stood out to me too.

It's not a form allowed by all the preceding rules, and I wasn't sure what it meant either.

It definitely needs to be defined. Using only the paper, I assumed from the previous definitions that (a b . c) should be parsed as (x y) where x = a and y = b . c which can be re-written using only pair notation as (x . (y . nil) ) or (a . ((b . c) . nil)) and this made it very confusing to try to figure out what is meant by "improper."

Even if a definition for the notation (a b . c) is added, an example of such an improper list fully broken down into pairs would certainly still be appreciated by us readers who don't already think in Lisp :)

Think of the period symbol as the root of a branch in a tree. A true list has all left branches empty. A "dotted list" has to items on both branches at the very tip.