I was thinking about this again this weekend, after spending a couple of hours reading some literate programs in light of Knuth's recent note https://cs.stanford.edu/~knuth/papers/cvm-note.pdf (via https://twitter.com/ksmeel/status/1661890306370072576) and the associated program https://shreevatsa.github.io/knuth-literate-programs/program..., a CWEB program he wrote a few days ago. It's been 9–10 days here and I'm not sure anyone will read this (I'm surprised HN still lets me reply), but anyway…

It seems that your main complaint is that in his literate programs, Knuth does not explain enough. To which my answer is: Sure! LP doesn't mean you have to explain everything; you can choose to explain as much as you want. It's not a "moralism" like "structured programming"; it's just a tool.

I think everything you're saying is along the same lines:

> The same explanation can be used to excuse the use of traditional (non-LP) programming systems—raising questions about why make a fuss about LP at all at that point if an LP text is going to treat the same types of shortcomings as a given.

Firstly, "excuse" seems to suggest an accusation that something is wrong with non-literate programming, which has not been made except as a joke — LP is just presented as a tool, with the expectation that it won't work for everyone. (Knuth is against moralism in programming style, and has often complained about it, e.g. in the context of defending GOTO and comparing pointers and what not.) So your question is basically asking: why attempt to explain at all, if one is not going to explain everything? The answer is, even without explaining everything, explaining as much as he does seems to work for him; he's been doing this for 40 years now and continues to rave about it.

> But of all the things that Knuth could explain, that's the one thing he does explain.

(I didn't understand this part.)

> immediately dumping a list of includes on the reader

As a rule of thumb, Knuth has said somewhere (maybe in the early days of LP, for TeX) that he targeted around 12 lines per LP section. So I would imagine that if he ever wrote a program with more than about 12 includes (which is pretty hard to imagine :-)), he would split the list of includes into multiple sections, presented separately. Below that (like the four or five includes in these examples), there's not much value in splitting it up further. I guess there's a lesson/hint here that splitting something up into sections is not "free", and below some threshold it starts having more cost than benefit. (Just like the cost of having lots of short functions in modern-day non-LP programs; Ousterhout's A Philosophy of Software Design has some succinct words about that: https://web.stanford.edu/~ouster/cgi-bin/aposd2ndEdExtract.p....)

In any case, not having separate sections for #includes is not a failure of imagination: if you look at the programs on his webpage, the very first example ("used as a handout for a lecture on literate programming") has #includes separated out: https://shreevatsa.github.io/knuth-literate-programs/program... — note that in section 2, after a self-mocking joke ("First we brush up our Shakespeare by quoting from The Merchant of Venice … This makes the program literate."), he uses printf, and in the next section includes stdio.h ("Since we’re using the printf routine"), and then in section 7 he has another include, with the words "UNIX’s localtime function does most of the work for us, but we need to include another system header file before we can use it." So the fact that he presented includes like this in his early demonstration program (October 1992), and does not bother to do so in later programs (including last week: May 2023) seems worth thinking about.

> The define at the top of the Symmetric Hamiltonian cycles exhibits the same thing just as clearly.

As mentioned before, this is one of the programs that "use the conventions and library of The Stanford GraphBase". If you look at the program (https://shreevatsa.github.io/knuth-literate-programs/program...), it starts with "We use a utility field to record the vertex degrees" and then has "#define deg u.I". This is in fact part of those conventions — in the SGB book index, there's an entry for "utility fields" pointing to pages 38–39 and 284, where this is explained:

> Every Vertex record contains eight subfields. We have already mentioned name and arcs; the other six subfields are called utility fields because they can be used for many different purposes. […] The six utility fields are named u, v, w, x, y, z, and their five possible interpretations are distinguished by adding one of the respective suffixes .I, .S, .G, .V, .A; thus, for example, v->w.I stands for the integer in utility field w of the Vertex record pointed to by v.

> Utility field names are usually given meaningful aliases by means of macro definitions. For example, GB-GAMES defines nickname to mean y.S.

and so on, and the book is full of definitions like "#define source a.V" and "#define back_arcs z.A" — so when the SHAM program begins with "we use a utility field" and then "#define deg u.I", this is established convention, not some wild thing pulled out of nowhere. When Kartik says “presumably a struct whose definition — whose very type name — we haven't even seen yet. (The variable name for the struct is also hard-coded in”, all of these presumptions are wrong: `u` is not a hard-coded variable name but the name of a field in Vertex (over the course of the program we can see v->deg, x->deg, u->deg, a->tip->deg: there's an index entry for "deg"), and the Vertex struct's definition and type name are well-documented (in a published book). I find it easy to take at face value that this was really the order in which Knuth thought of things, and also the level of exposition that he finds most useful (for his intended audience, which is himself).
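
To make the convention concrete, here is a minimal C sketch of how such a utility-field alias works. It is a simplified stand-in, not the actual SGB declarations: the `util` union below omits the `.G` variant, `Arc` and `Vertex` carry only the fields needed here, and `record_degree` is a hypothetical helper I made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the SGB types (assumption: the real
   declarations in the Stanford GraphBase have more fields). */
struct vertex_struct;
struct arc_struct;

typedef union {
    struct vertex_struct *V; /* utility field read as a Vertex pointer */
    struct arc_struct *A;    /* ... as an Arc pointer */
    char *S;                 /* ... as a string */
    long I;                  /* ... as an integer */
} util;                      /* the .G (Graph pointer) variant is omitted */

typedef struct arc_struct {
    struct vertex_struct *tip; /* the vertex this arc leads to */
    struct arc_struct *next;   /* next arc leaving the same vertex */
} Arc;

typedef struct vertex_struct {
    Arc *arcs;              /* linked list of outgoing arcs */
    char *name;             /* symbolic name of the vertex */
    util u, v, w, x, y, z;  /* the six utility fields */
} Vertex;

/* The alias from the SHAM program: "deg" names the integer
   interpretation of utility field u. */
#define deg u.I

/* Hypothetical helper: count the arcs leaving v, store the count in
   v->deg (that is, v->u.I), and return it. */
long record_degree(Vertex *v) {
    assert(v != NULL);
    v->deg = 0;
    for (Arc *a = v->arcs; a != NULL; a = a->next)
        v->deg++;
    return v->deg;
}
```

After preprocessing, `v->deg` is just `v->u.I`, which is why expressions like `a->tip->deg` work on any Vertex pointer: `u` is a field name inside the record, not a variable.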

This points at a problem with literate programming that I also mentioned in the other thread: because it can be so personal, everyone who uses LP has their own idea of what is most worth explaining, and even ends up building their own LP tools. (Similar things are said about every Lisp programmer ending up with their own idioms and mini-languages.)

> he was already so warped and tainted from years of doing work in the bottom-up tradition to satisfy the compiler that it ends up clouding his vision

Another way of saying this is that Knuth does not believe in hiding the fact that you're writing code that a compiler will read and a computer will execute. In fact he continues to annotate variables with "register" even though compilers ignore it, simply because he likes to be conscious of what instructions the (ideal) machine will execute. He's not a big believer in abstraction and hiding details by passing to a higher level; he believes in (and somehow manages) constantly being aware of all levels at once.
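
As a small illustration (my own toy example, not taken from any of Knuth's programs), this is roughly what that habit looks like in C. The function name `sum_array` is hypothetical. A modern optimizing compiler will typically do its own register allocation regardless of the keyword; the one thing the C language still enforces is that you cannot take the address of a `register` variable.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the n elements of a, with the loop variables marked "register"
   as a note to the human reader about what the machine should keep
   close at hand. */
long sum_array(const long *a, size_t n) {
    register long s = 0;   /* running total */
    register size_t k;     /* loop index */
    assert(a != NULL || n == 0);
    for (k = 0; k < n; k++)
        s += a[k];
    /* Writing &s or &k here would be a compile-time error in C. */
    return s;
}
```

So the annotation mostly serves as documentation of intent, which fits the idea of staying aware of the machine level while writing for a human reader.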

Yet another way of saying this (my theory) is that when machine code / assembly programs became too hard to maintain, the rest of the world solved the problem by, over time, settling on abstraction, interfaces, higher-level languages, style conventions, information-hiding, and all that. Instead, Knuth has forged his own programming path/style, still basically writing machine-sympathetic programs, but "explaining to human beings what we want a computer to do". It is up to others to merge LP's human-orientation with mainstream ideas… if it will ever happen.

Anyway, it seems that the criticisms we're discussing are mostly based on first impressions. While they are valuable (and indicative of what others' first impressions will be), ultimately I think criticisms based on studying these programs more closely would be more interesting. E.g. has his LP style evolved over time? What can we learn from studying recent programs? (There are over a dozen since 2020 alone, like the program above or others posted on his webpage, which I just typeset yesterday at https://shreevatsa.github.io/knuth-literate-programs/program....) What can we learn from what he chooses to explain and what he does not? Why does he make those choices? In what way does this seem to help him? Can we use these insights (not the same style) for our own programming practice? Things like that.