Hacker News new | ask | show | jobs
by Animats 3439 days ago
I'm still struggling with my own legacy code problem. I'm reviving an old LISP program from the early 1980s. Parts of it were written for the original Stanford AI Lab SAIL system in the 1970s. It last ran under Franz LISP in 1986.

My current struggle is with one line of code:

   (defun getenode (l) (cadr l))
That ought to be simple enough. But it's being applied not to a list, but a "hunk". A "hunk" is an obsolete MacLISP concept.[1]. It's a block of memory which has N contiguous LISP cells, each with two pointers. This is the memory object underlying structures and arrays in MacLISP. Macros were used to create the illusion of structure data objects, with hunks underneath. However, you could still access a "hunk" with car, cdr, cxr, etc.

I'm converting this to Common LISP, which has real structures, but not hunks. That, with some new macro support, works for the regular structure operations. So far, so good.

But which element of the structure does (cadr l), which usually means the same thing as "(car (cdr l))", access? (cadr (list 0 1 2 4)) returns 1, so you'd think it would be field 1 of the structure. But no. It's more complicated and depends on how hunks are laid out in memory.

The Franz LISP manual from 1983 [2] says "Although hunks are not list cells, you can still access the first two hunk elements with cdr and car and you can access any hunk element with cxr†." At footnote "†", "In a hunk, the function cdr references the first element and car the second." This is backwards from the way lists behave.

A blog posting from 2008 about MacLISP says "A Maclisp hunk was a structure like a cons cell that could hold an arbitrary number of pointers, up to total of 512. Each of these slots in a hunk was referred to as a numbered cxr, with a numbering scheme that went like this: ( cxr-1 cxr-2 cxr-3 ... cxr-n cxr-0 ). No matter how many slots were in the hunk, car was equivalent to (cxr 1 hunk) and cdr was equivalent to (cxr 0 hunk)." Note that element 0 is at the end, which is even stranger. The documentation is silent about what "cadr" would do. Does it get element 2, or get element 0 and then apply "car" to it?

The original code [3] contains no relevant comments. I'm trying to figure out from the context what the original author, Greg Nelson, had in mind. He died in 2015.[4]

[1] http://www.mschaef.com/blog/tech/lisp/car-cdr.html [2] http://www.softwarepreservation.org/projects/LISP/franz/Fran... [3] https://github.com/John-Nagle/pasv/blob/master/src/CPC4/z.li... [4] https://en.wikipedia.org/wiki/Greg_Nelson_(computer_scientis...

5 comments

There are only a few possibilities for what (cadr hunk) could mean. One way to solve this is to try them all and see which one runs successfully.

Based on †, it sounds like (cadr (hunk (hunk 1 2) 3)) should return 2.

Is the old LISP code available online somewhere? I'm curious to see it.

You can run it on an emulated PDP-10 running ITS. A friend of mine has been creating an easy-to-use project setting all of this up. MacLisp is included in the the system.

https://github.com/PDP-10/its

See the link listed above: https://github.com/John-Nagle/pasv/blob/master/src/CPC4/z.li...

I put all the code on Github. The oldest version of each file is exactly what ran in 1986.

The code is delicate. It's a theorem prover, and there's much manipulation of complex data structures, with little explanation of what's going on. The overall theory is documented; this is the original Oppen-Nelson simplifier and there are published papers. But the code has few comments.

getenode is only called in a few places, and each call looks like:

  (getenode znode)
i.e. getenode is only called on znodes. The definition of makeznode is:

  (defun makeznode (node)
         (prog (l znode)
               (setq l (list (list '(1 . 1) node)))
               (xzfield node (list l node nil))
               (setq znode (tellz l node))
               (or (null znode) (eq znode node) (break makeznode))
               (return l)))
So it seems like getenode is called on a regular list whose structure looks like

  (list (list '(1 . 1) node))
Assuming "node" is an enode, the way to access it would be (cadar znode), not (cdar znode). Try changing the definition of getenode to

  (defun getenode (l) (cadar l))
and see if it runs.
I thought that way at first. But the program breaks when (getenode znode) is called on an item of type "node", not a list. This happens when (interntimes) takes the path which ends

    (or (setq znode (tellz l node)) (return t))
    (or (eq node (getenode znode)) (zmerge node (getenode znode)))))
and (tellz) has taken the path which ends (return (baserowz* i)). "baserow*" is an array which contains links to "node" items, not list cells. What "z.lisp" and "ze.lisp" are doing, by the way, is solving systems of linear inequalities by using linear programming on a sparse matrix.

Also, see

    (defun isznode (x) (and (hunkp x) (= (hunksize x) 8))) ; original

    (defun isznode (x) (equal (type-of x) 'node))   ; CL version
which indicate that znodes are hunks/structures, not lists. This is inconsistent, yet somehow it used to work.

(If you want to talk privately about this, I'm at "nagle@animats.com". Too much detail for HN.)

or abuse, ouch!

   (unless (setq znode (tellz l node))
     (return t))
Having done a bit of lisp spelunking myself, I would suggest that you dig into the source of the lisp your program ran under. Trying to figure out behavior from the surrounding context can be rather difficult when you're playing with an archaic, sparsely documented lisp.
Note that this is the order of display:

The order of display of hunk slots is historical in nature. For better or worse, the elements of a hunk display in order except that the 0th element is last, not first. e.g., for a hunk of a length n+1, (cxr1 . cxr2 . ... . cxrn . cxr0 .)

It could still make sense to have the layout in memory being sequential, like (cxr-0 == CDR, cxr-1 == CAR, ...others ...).

Note also that CAR extracts the leftmost element of a hunk, just as it addresses the leftmost element of a cons. Similarly, CDR extracts the rightmost element of hunks and conses.

It seems more logical that CADR is just the combination of CAR with CDR. It don't think the designers would try to transpose the fact that it means "second", with proper lists, for hunks. It just seems unlikely, but I have no proof.

Also:

(Note that the operation CAR is undefined on hunk-1's, but CDR is not.) This means that if you want to make a plist for a hunk of your own, you can use its cdr as a hunk; it does not mean that you can blindly assume that any hunk wants its CDR treated that way. The exact use of the slots of a hunk is up to the creator; it's a good idea to mark your hunks (e.g., by placing a distinctive object in their cxr-1 slot) so that you can tell them from hunks created by other programs.

My guess is that there is some metadata associated with a hunk, stored in CRX-0, a.k.a. CDR.

http://www.maclisp.info/pitmanual/hunks.html

You should consult the Maclisp manuals. There is one in ITS, and the Pitmanual is here: http://maclisp.info/pitmanual/index.html

You should probably also test some code in Maclisp.