I do agree about homoiconicity. It's great for writing macros, and terrible for everything else. Of course, go too far in the other direction and you get perl, so... yeah.
I don't want to reopen this 50+ year old can of worms, but I have 2 questions:
1. Is it about homoiconicity in general or specifically s-expressions [EDIT: I see GP writes about s-exps specifically, missed it at first]? Prolog, Erlang, TCL, and Rebol (just some examples) are homoiconic, but not s-exps based. What do you think about them?
2. Do you often read code without syntax highlighting and proper indentation? Assuming the code is properly indented and colorized, what makes it so hard to read in your eyes? Take a look for example at snippets in: https://docs.racket-lang.org/quick/index.html#(part._.Local_... - what do you feel is wrong with them? Is it only the placement of parens, or is there something else?
1. It's homoiconicity in general. It's most obvious in arithmetic expressions, but homoiconicity (as far as I've seen) eschews semantic indicators besides function name and argument position. I find that keywords and symbols are almost indispensable for smooth eye-parsing of code.
2. No, I use syntax highlighting and indentation all the time. Trying to put words to my vague thoughts, I think the main issue is that, for example, in a let block the only indication of the meaning of all the items is the single function/macro name at the beginning of the block. Whereas in an algol-type language you have assignment operators in each item to indicate what they mean.
> I find that keywords and symbols are almost indispensable for smooth eye-parsing of code.
Sure, but that's a totally separate issue from homoiconicity. All the examples I mentioned are infix languages (although TCL requires you to opt-in for that):
(sorry for the links to blames, but it's impossible to link to a specific line otherwise)
Prolog lets you define your own operators and gives you control over their associativity and precedence. TCL `expr` works like `$(( 2 + 3 ))` in BASH (which also doesn't normally support mathematical operators), and by reimplementing it you can add your own operators as well. Erlang also allows you to define new operators, although it's much more of a hassle there, as you need to write a parse transform (which is only this easy to do because the language is homoiconic). I'm not sure about Rebol/Red - I had only passing contact with it long ago.
In other words, homoiconicity itself has little to do with where the "keywords and symbols" are placed in the code, or whether they are used at all.
Incidentally, Lisps also support infix syntax just like TCL: for most Lisps you can grab a lib/package/source of the `infix` macro, which allows you to write infix arithmetic (and code in general) without problems.
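To make the `infix` idea a bit more concrete, here is a rough sketch of the rewriting such a macro performs. It is written in Python on plain nested lists rather than in any Lisp, and the name `infix_to_prefix` and the precedence table are made up for illustration; this is not the API of any actual `infix` package.

```python
# Hypothetical sketch: rewrite a flat infix token list into a nested
# prefix (s-expression-like) list. Only binary, left-associative
# operators are handled; the names and the table are illustrative.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

def infix_to_prefix(tokens):
    """Turn [2, '+', 3, '*', 4] into ['+', 2, ['*', 3, 4]]."""
    def parse(pos, min_prec):
        left = tokens[pos]
        if isinstance(left, list):        # parenthesised sub-expression
            left = infix_to_prefix(left)
        pos += 1
        while pos < len(tokens) and PRECEDENCE[tokens[pos]] >= min_prec:
            op = tokens[pos]
            # Parse the right operand with a higher minimum precedence
            # so that operators of equal precedence associate to the left.
            right, pos = parse(pos + 1, PRECEDENCE[op] + 1)
            left = [op, left, right]
        return left, pos

    tree, _ = parse(0, 1)
    return tree
```

In a Lisp the same transformation would run at macro-expansion time, so the infix form costs nothing at run time; the point is only that the surface syntax is a thin layer over the same prefix tree.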
> I think the main issue is that, for example, in a let block the only indication of the meaning of all the items is the single function/macro name at the beginning of the block. Whereas in an algol-type language you have assignment operators in each item to indicate what they mean.
Ok, that's one way of looking at this. On the other hand, one can also say that the repeated use of the operator where the meaning of each line is already determined (by the fact that it's inside `let` block) is useless cruft, which actually hinders comprehension by forcing you to mentally parse one element more on every single line of the assignment block. In Lisp, it's actually trivial to extend the language to include this form of `let`:
(let
  ((a = some_expr) ;; could be `:=` `<-` `is` instead of `=`
   (b = some_other_expr))
  some_statements
  ...)
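For what it's worth, the rewriting behind such a `let` is tiny. Below is a sketch of it in Python, modelling the form as nested lists; `desugar_let` and its `separators` parameter are hypothetical names chosen for this example, and a real Lisp macro would do the equivalent on actual s-expressions.

```python
# Hypothetical sketch of the rewriting such a macro would perform,
# modelled on nested Python lists instead of real s-expressions.
def desugar_let(form, separators=("=", ":=", "<-", "is")):
    """Rewrite ['let', [[a, '=', e1], ...], body...] into a plain let."""
    head, bindings, *body = form
    if head != "let":
        raise ValueError("expected a let form")
    plain = []
    for binding in bindings:
        if len(binding) == 3 and binding[1] in separators:
            name, _, expr = binding      # drop the decorative separator
            plain.append([name, expr])
        else:
            plain.append(binding)        # already in standard form
    return [head, plain, *body]
```

The separator carries no meaning of its own here, which is exactly why it can be dropped mechanically.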
In Scheme, the fundamental conditional construct called `cond` even has an operator-like construct built in:
;; in Racket
(cond
  [some_condition => a_function_called_on_the_result_of_condition_expr_if_its_not_false])
Yet, including this kind of additional syntax is not widespread in the community. Unless the additional syntactic element actually changes the meaning of the code, like in `cond` case, it is seen as superfluous and not needed.
Both opinions have some merit to them. How you feel about both styles is largely determined by what you're familiar with and what you're used to. In the first style, you have to learn to ignore some token in some places as they don't add any meaning to the code. In the second, you need to be careful not to mistake one kind of block for another, and you need to learn what is the relation between subexpressions in each kind of block.
Both approaches require learning. The difference is that you already learned the first one, while you're not familiar with the other. But - and that's what I really wanted to say - there's no difference between readability of the two approaches once you learn them. In other words, Lisp code is as readable for Lisp programmers as JavaScript is for JS programmers. Further, all Lisps are about equally readable to any given Lisp programmer, just as all Algol-like languages are readable to JS programmers.
With a bit of practice (and I mean a bit, like a few days; if you want a language you won't find readable after half a year, go for J) you could read Lisp-like code without problems. There's no inherent unreadability to either Lisp- or Algol-like syntaxes; that's what I'd like to convince you of :)
>I see GP writes about s-exps specifically, missed it at first
My experience is with s-expressions being a wall of text. I haven't used Prolog (and Erlang uses the same syntax) enough to have a beef with them. Perhaps I was overgeneralizing.
> My experience is with s-expressions being a wall of text.
That's sad, but unfortunately not that rare. Making a Lisp readable is not as easy as doing so in Python.
In my experience, Lisp code written by an experienced lisper who cares about readability can often be much more readable than well-written Python (assuming equal levels of proficiency in their respective languages). On the other hand, the number of ways you could destroy the readability of Lisp source is endless, they are always close by, and are even context-dependent. Beginners or people without the focus on readability use those ways rather liberally.
With Python the situation is the exact opposite: Python defines a lower bound on code readability ("Readability counts" -> PEP-8 -> linters -> (lately) `black`) while Lisps, in general, don't even have an authoritative PEP-8 equivalent.
On the other hand, Python also has an upper bound on how readable it can be: its syntax is rich, but if you find yourself in a place where it's not rich enough, you're on your own. You could just use the expressive power of the language to hijack the syntax and beat it into shape better suited for your problem, but it will be almost certainly seen as un-Pythonic.
On this point, Lisps have a huge advantage, because you can change the language parser on the fly easily, and you can add whatever syntax sugar you need in a couple of lines of a macro. In other words, Lisps give programmers tools for making their code as readable as they want (and are able to) while at the same time allowing them to write "walls of text" (and honestly, that'd be a very polite way of describing some of the Lisp code I've seen) which - in readability - could be one of the worst among many languages I've seen.
So, what I want to say here is that it's possible - and not that hard - to write readable Lisp code. Unfortunately, a programmer has to both think of readability when writing and have the skills to turn their ideas about readability into reality.
In effect, yes, there's a lot of Lisp code which is hard to ingest. Some style guides are there or are being created, paredit helps a lot, I'm not aware of any linters yet, but they should start appearing at some point. On the other hand, Lisp code skillfully crafted for readability rivals (and sometimes surpasses) Python at its best.
I'm not sure what Lisp code you've seen, but I assume it was all of the former kind. This is unfortunate. Without knowing what it was you were reading/working with it's hard to recommend anything, but I found examples in "How to Design Programs" quite readable: https://htdp.org/2019-02-24/part_five.html and there are also other books and Open Source projects with code worth reading, but I'd have to dig through my bookmarks, which I don't have the time for right now, sorry :(
TLDR: Lisps - Schemes, Racket, CL, PicoLisp, Emacs and TXR Lisps to name a few - can be used to write astonishingly readable and to-the-point code, but the languages do absolutely nothing to discourage using them to write the most unreadable mess under the heavens. As for the reasons for this - I've honestly no idea at all.
> Python defines the lower bound on the code readability
Ignoring ;... comments for a moment, if we squash a Lisp program into one line and remove all non-essential whitespace, it's possible to recover it into nicely formatted code by machine, more or less.
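As a rough illustration of why this is mechanical, here is a deliberately crude sketch in Python (the thread's comparison language) that re-indents a squashed s-expression. The function names and the line-breaking rule are assumptions made for this example; real Lisp formatters use much better, form-aware indentation, and strings, comments and reader notations are not handled here at all.

```python
# Hypothetical sketch: re-indent a whitespace-squashed s-expression.
# Strings, comments and reader notations are deliberately not handled.
def tokenize(text):
    return text.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    """Parse one form from the token list, consuming tokens in place."""
    token = tokens.pop(0)
    if token != "(":
        return token
    form = []
    while tokens[0] != ")":
        form.append(parse(tokens))
    tokens.pop(0)  # drop the closing paren
    return form

def flat(form):
    """Render a form on a single line."""
    if isinstance(form, str):
        return form
    return "(" + " ".join(flat(f) for f in form) + ")"

def pretty(form, indent=0, width=40):
    """Keep short forms on one line; otherwise one argument per line."""
    one_line = flat(form)
    if isinstance(form, str) or not form or indent + len(one_line) <= width:
        return one_line
    head, *rest = form
    pad = " " * (indent + 2)
    lines = ["(" + pretty(head, indent + 1, width)]
    lines += [pad + pretty(item, indent + 2, width) for item in rest]
    return "\n".join(lines) + ")"
```

Calling `pretty(parse(tokenize(squashed)))` on a one-line program gives back an indented version whose structure is identical to the input; the parentheses carry all the information the formatter needs.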
> TLDR: Lisps - Schemes, Racket, CL, PicoLisp, Emacs and TXR Lisps to name a few - can be used to write astonishingly readable and to-the-point code, but the languages do absolutely nothing to discourage using them to write the most unreadable mess under the heavens. As for the reasons for this - I've honestly no idea at all.
This view is unbalanced without noting that JavaScript, Rust, C, Perl, Java, Scala, Go, Kotlin, ... and a large number of other languages have the same flexible formatting that allows for unreadable code. Ruby, anyone? https://github.com/mame/quine-relay/blob/master/QR.rb
> As for the reasons for this - I've honestly no idea at all.
Bad formatting is a bug that is fixable in the actual code (greatly assisted by automation) and a minor social problem in programming that is treatable with education and experience. Therefore, it is nearly a non-issue.
You're right, of course, on all points (BTW, WTF is with the downvotes??), but they are technicalities of interest to lispers. I omitted these because I wanted to present a convincing argument that fnord123 simply had bad luck and encountered bad Lisp code. And that there's a lot of good, readable code written in s-exps out there.
The resistance to s-exps in the general population of programmers is bad "news" (if something 50 years old can be called that...) for Lisps. It's hard to fight it in general terms. Pointing out that C, JS, Perl, etc. are often much worse in terms of readability - while obviously true - doesn't really help in convincing someone to look at s-exps differently. This is why I chose Python for comparison and tried to present a positive argument, saying that you can write code "even more readable than Python at its best" in Lisp.
I ignored automatic formatting because it's not part of the language, but of tooling. The problem with tooling is that not everyone uses it. I've had the "pleasure" of working with a 50+ kloc Clojure code base written mostly by C programmers who didn't know or care about formatting tools - honestly, it was a nightmare. Of course, each file could be automatically reformatted into something sensible, but the fact that it was written the way it was and the language did nothing to prevent that still stands. In Python, you at least would get the indentation right.
Readability is a hard problem in general. You're right that it's also a matter of education in the community. You're right that it's an almost negligible problem for lispers themselves, as they know how to reformat the code automatically with a single key press. It is a problem, though, for people who come into contact with Lisp code for the first time. I wanted to convince fnord123 that it's not the syntax itself, but rather how it is used that's a problem - like with every other kind of syntax out there, by the way. I'd be extremely happy if he reconsidered and tried to read some of the better-written s-exps based code.
In Lisp languages, we don't use pure S-exps for everything; we have notations. We have 'X instead of (quote X), `(,A ,@B) instead of (list* A B), and numerous # notations.
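A small sketch of what "notation on top of S-exps" means in practice: the reader expands the shorthand while parsing, so everything downstream only ever sees ordinary lists. This is a Python model with made-up names (`lex`, `read_form`), handling only the quote character; it does not describe how any particular Lisp reader is implemented.

```python
# Hypothetical sketch of a reader expanding the 'X shorthand into
# (quote X) while parsing. Only ' is handled; ` , ,@ and the #
# notations would be further cases of the same idea.
def lex(text):
    for ch in "()'":
        text = text.replace(ch, f" {ch} ")
    return text.split()

def read_form(tokens):
    """Read one form, consuming tokens from the front of the list."""
    token = tokens.pop(0)
    if token == "'":
        return ["quote", read_form(tokens)]   # 'X  ->  (quote X)
    if token == "(":
        form = []
        while tokens[0] != ")":
            form.append(read_form(tokens))
        tokens.pop(0)                         # drop the closing paren
        return form
    return token
```

Because the expansion happens in the reader, macros and the evaluator need no special knowledge of the shorthand.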
In the area of arithmetic, although the basic operators are functions invoked using (f arg ...), we give them short names like +, -, * and /. Why? The obvious reason is that we would find it irksome to be writing (add ...) and (mul ...).
Lisp can have notations, and they can be had without disturbing the Lisp syntax. Notations that are related to major program organization have payoff.
In TXR Lisp there is a relatively small set of new notations, all of which correspond to S-exp forms, the same way that 'X corresponds to (quote X).
;; slot access
obj.x.y.z --> (qref obj x y z)
Of course, people are going to prefer this to something like:
(slot-value (slot-value x 'y) 'z)
Then:
;; unbound slot access
.x.y.z --> (uref x y z)
;; method call
obj.x.(f a b) --> (qref obj x (f a b))
;; x.f(blah).g(foo).xyzzy(x, y) pattern:
x.(f blah).(g foo).(xyzzy x y) ;; looks like this
;; sequence indexing, function calls (the "DWIM" operator)
[array i] --> (dwim array i)
[f x y] --> (dwim f x y)
;; ranges:
a..b --> (rcons a b)
;; slice
[str 0..3] --> (dwim str (rcons 0 3))
;; Python-like negative indexing:
[str -4..:] --> (dwim str (rcons -4 :)) ;; : means "default value": one index past the end of the sequence (its length).
;; quasistrings -- recently appeared in JavaScript in strikingly similar form!
`@a @b ...` --> (sys:quasi @a " " @b) --> (sys:quasi (sys:var a) " " (sys:var b))
;; word list literals
#"a b c" --> ("a" "b" "c")
;; quasi word list literals
#`a @b c` --> (sys:quasilist `a` `@b` `c`)
Some Lisp syntax is streamlined:
(lambda (a b c : x y . r) ...) ;; a b c required, x y optional, r rest
This works even if the thing in the dot position is a symbol macro expanding to a compound form. The reason is that the code walker/expander will recognize and transform (func ... . rest) into (sys:apply (fun func) ... rest) first, and then expand macros. (I.e. we can't work this into existing Lisps like CL implementations without going down to that level.)
;; the : symbol -- symbol named "" in keyword package:
;; used as a "third boolean" in various places
(func 1 2 : 4) ;; use default value for optional arg, pass 4 for the next one
;; diminishes need for keyword args
;; built-in regex syntax
#/a.*b/
;; C-like character escapes
"\t blah \x1F3 ... \e[32a"
;; multi-line strings with leading whitespace control:
"Four sc \
ore
\ and seven years ago" -> "Four score and seven years ago"
;; Simple commenting-out of object with #;
#; (this is
commented out)
Also, there is no programmable reader in TXR Lisp; no reader macros. I'm not a big fan of reader macros. They are only useful for winning "I can have any damn syntax in my language" arguments. Problem is, the whole territory of "any damn syntax" is a wasteland of bad syntax, not to mention mutually incompatible syntax.
>I do agree about homoiconicity. It's great for writing macros, and terrible for everything else.
I don't know; I've written code in C, C++, C#, Java, Python, Ruby, Pascal, Delphi, x86 Assembler, TCL, JavaScript and Common Lisp. Lisp codebases are the cleanest and clearest I've seen by far, although ReasonML/SML/OCaml might be as clean too.