A different take on S-expressions (gist.github.com)
64 points by tearflake 367 days ago
14 comments

Lisp programmer here.

Traditional S-expressions, by their definition, ignore most whitespace; additionally, reading sexprs is always a linear operation without the need to backtrack by more than one character.
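To make the "linear" claim concrete, here is a minimal sketch of a traditional reader in Python. The names `tokenize` and `read_sexpr` are made up for illustration, and it handles only parentheses and bare atoms (no strings, quotes, or dotted pairs). It consumes the input strictly left to right and treats any run of whitespace, newlines included, as a plain separator:

```python
def tokenize(text):
    """Yield tokens in a single left-to-right pass over the characters."""
    atom = []
    for ch in text:
        if ch in "()" or ch.isspace():
            if atom:                      # a delimiter ends the current atom
                yield "".join(atom)
                atom = []
            if ch in "()":
                yield ch
        else:
            atom.append(ch)
    if atom:
        yield "".join(atom)


def read_sexpr(text):
    """Parse one S-expression into nested Python lists of strings."""
    stack = [[]]                          # stack[-1] is the list being filled
    for tok in tokenize(text):
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                raise SyntaxError("unexpected ')'")
            done = stack.pop()
            stack[-1].append(done)
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise SyntaxError("missing ')'")
    if len(stack[0]) != 1:
        raise SyntaxError("expected exactly one expression")
    return stack[0][0]


# Line breaks and indentation make no difference to the result.
print(read_sexpr("(eq (mul (a a))\n    (pow (a 2)))"))
# ['eq', ['mul', ['a', 'a']], ['pow', ['a', '2']]]
```

The reader never looks at a character twice and never needs to know which column it is in, which is exactly the property a 2D notation gives up.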

The suggestion from this post violates both assumptions by introducing a 2D structure to code. To quote this post's examples, it requires the multiline string in

    (fst-atom """   trd-atom)
              00001
              00002
              00003
                """
to be fully read before TRD-ATOM. It also forces the reading function to jump up and down vertically in order to read the structure in

    * (                               )  
    *   e (           ) (           )    
    *   q   m (     )     p (     )     *
            u   a a       o   a 2       *
            l             w             *
The author also states that

    (eq (mul (a a)) (pow (a 2)))
is less readable than

    * (                                                  )  
    *   *eq* (                   ) (                   )    
    *          *mul* (         )     *pow* (         )     *
                       *a* *a*               *a* *2*       *
                                                           *
Then there's the ending passage:

> we hope that the introduced complexity is justified by the data readability expressed this way.

I cannot force myself to read this post as anything but a very poor Befungesque joke.

It gets worse/better. Since Racket allows you to hook your own reader in front of (or in place of) the default reader, you can have things like 2D syntax:

    #lang 2d racket
    (require 2d/match)
     
    (define (subtype? a b)
      #2dmatch
      ╔══════════╦══════════╦═══════╦══════════╗
      ║   a  b   ║ 'Integer ║ 'Real ║ 'Complex ║
      ╠══════════╬══════════╩═══════╩══════════╣
      ║ 'Integer ║             #t              ║
      ╠══════════╬══════════╗                  ║
      ║ 'Real    ║          ║                  ║
      ╠══════════╣          ╚═══════╗          ║
      ║ 'Complex ║        #f        ║          ║
      ╚══════════╩══════════════════╩══════════╝)
https://docs.racket-lang.org/2d/index.html
Truth be told, you can intercept the reader in Common Lisp, too, and here it actually makes some sense since the 2D value is immediately visually grokkable as an ASCII-art table. The proposed 2D sexpr notation does not have this.
That's amazing and terrible at the same time. I love it.
A normal tree would be easier to read

            eq
       mul      pow
     a    a    a   2
Turned 90, maybe?

  eq:
    mul:
      a
      a 
    pow:
      a
      2
That's the classical LISP way of doing it:

    (eq (mul a
             a)
        (pow a
             2))
or

    (eq
        (mul
             a
             a
        )
        (pow
             a
             2
        )
    )
x*x == pow(x,2)
We have a winner!

Actually, I'd suggest a slight improvement: x*x = x^2

x · x = x²
(== (* x x) (pow x 2))
(= * x x ^ x 2)
no yaml programming please :(
From the YAML inventor himself: https://yamlscript.org/

The lengths people go to in order to avoid Lisp, only to reinvent it, badly.

Yes that part must be a joke!

I’ve seen dozens of attempts to make S-Exp “better”, even the original M-Exp. I also did some experiments myself. But in the end, I come back to good ol’ s-exp. Seems to be a maximum (or minimum) found just by chance.

Here is another example, an axiom from propositional logic:

    (impl (impl p (impl q r)) (impl (impl p q) (impl p r)))
which, vertically indented in a transposed block, looks like this:

    * (                                               )
    *   i (               ) (                       )
    *   m   i p (       )     i (       ) (       )
        p   m     i q r       m   i p q     i p r
        l   p     m           p   m         m           *
            l     p           l   p         p           *
                  l               l         l           *
which, using transposed lines within the transposed block, finally looks like this:

    * (                                                                                           )
    *   *impl* (                               ) (                                              )   *
    *            *impl* *p* (                )     *impl* (                ) (                )     *
                              *impl* *q* *r*                *impl* *p* *q*     *impl* *p* *r*       *
This time I won't make any judgements. Could be good, could be bad, you decide.
Not sure if that example helps. You can make any programming language hard to read without some basic formatting. The way I would write the sexpr would be:

  (impl
    (impl 
       p 
       (impl q r))
    (impl
       (impl p q)
       (impl p r)))
It's clear when each section begins and ends and doesn't require complex parsing rules.
That looks clean, can't argue that.
Thanks for restoring my sanity. Was quite confused about the value added by the author.
Sorry for the confusion. I must be a very disturbed person because I kind of like what is explained there.
Here, I brought down the enthusiasm a bit in the closing word. I hope it creates less confusion now.
These definitely are extensions that you could add to S-expressions, no one can disagree there.
Related but not the same at all: Racket has a 2D syntax (add-on mode) that gives a different way to program tables where the output depends on two different inputs.

https://docs.racket-lang.org/2d/

Feels like the complete opposite of s-expressions, which are the easiest possible thing to parse; this sounds like a complete nightmare to write a parser for.

It might even be easier to treat the input string as a 2D grid than as a sequence and have a parsing head that behaves like a 2x2 convolutional kernel...

This would make for either a great Advent of Code, or a nightmare interview question, I love it.

Actually it's quite simple. We parse from left to right. When we hit EOL, we return to the beginning of line and increase Y by one.

Blocks are parsed in the following way: when we read the opening run of block characters, we note its count, move Y down by one, and loop right while skipping whitespace, until we encounter a closing run of block characters with the same count.

In a transposed block, we just switch X and Y (easily done with pointers) and use the same code.
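A rough sketch of the "switch X and Y" idea in Python, assuming a block can be treated as a rectangular character grid. The helper name `transpose_block` is hypothetical and this is not the gist's actual parser. Transposing the grid once lets an ordinary left-to-right reader handle text that was written top to bottom:

```python
def transpose_block(lines):
    """Swap rows and columns of a block of text, padding short lines."""
    width = max((len(line) for line in lines), default=0)
    padded = [line.ljust(width) for line in lines]
    # zip(*padded) walks the grid column by column.
    return ["".join(col).rstrip() for col in zip(*padded)]


vertical = [
    "e m p",
    "q u o",
    "  l w",
]
print(transpose_block(vertical))
# ['eq', '', 'mul', '', 'pow']
```

A real implementation would more likely swap the roles of the two coordinates while scanning rather than copy the text, but the effect on what the reader sees is the same.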

    (fst-atom """   trd-atom frt-atom
      """     00001
      asdf    00002 """    fth-atom)
      qwer    00003 hahaha
      zxcv      """ hehehe
      """           hohoho
                    """
I'm not sure I'd like the above to be parseable.
For very large values of "somewhat peculiar"...
Changed the "somewhat" to "very" in the document, thank you.
I'll piggyback with my gruesome JSONification of S-expressions. I kinda liked having two kinds of braces [straight] and {curly} to differentiate arrays and objects, and I did have an event-loop-based "parallel" scheduler working to process a tree as soon as prerequisites were fulfilled. I might pick up the old project again someday, I just got hung up on how I wanted to handle error bubbling.

With a vertical script like Japanese you could easily rotate the whole program 90 degrees to the right (as shown at the bottom of the landing page)

https://web.archive.org/web/20240904091932/https://lookalive...

  {
  "#!join": [
    [
      "A triangle with side of ",
      "#& side",
      " and base of ",
      "#& base",
      "has a hypotenuse of",
      {
        "#!sqrt": [
          [
            {
              "#!sum": [
                [
                  "#!multiply side side",
                  "#!multiply base base"
                ]
              ]
            }
          ]
        ]
      }
    ]
  ]
  }
This is bonkers and I love it.
Ikr? People should loosen up a bit; why should everything be so serious?
This is fine and interesting, but what I think is lacking in S-expressions isn't funky vertical syntax, but a way to directly represent objects that are not lists. Otherwise one needs to invent some representation on top of S-expressions (and then a list isn't necessarily a list anymore; everything goes through an additional layer of encoding/decoding, losing the elegance of S-expressions), or use some extension syntax (usually involving '#'), which varies from language to language and might not even be interpreted by the reader (but logically expand to some list expression that needs to be interpreted again later, so you're not really any better off than with the first approach).

I kind of want something like, to borrow JSON-like syntax and gloss over namespacing issues:

  (foo .
    {type: listy-cons-cell
     head: bar
     tail: (baz quux)})
...which would be another way to say (foo bar baz quux), but would make it possible to represent any data structure you like at the same level as atoms, strings, and lists.
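For anyone who wants to check the equivalence, here is a small Python sketch of it. `Cons`, `from_list` and `to_list` are illustrative names, and `None` stands in for the empty list; the point is only that a record with a head and a tail placed in dotted-tail position behaves like the rest of the list:

```python
from dataclasses import dataclass
from typing import Any

NIL = None  # stand-in for the empty list


@dataclass(frozen=True)
class Cons:
    """A cons cell written as an explicit record with two fields."""
    head: Any
    tail: Any


def from_list(items):
    """Build a chain of cons cells from a Python list."""
    cell = NIL
    for item in reversed(items):
        cell = Cons(item, cell)
    return cell


def to_list(cell):
    """Flatten a chain of cons cells back into a Python list."""
    out = []
    while cell is not NIL:
        out.append(cell.head)
        cell = cell.tail
    return out


# (foo . {head: bar, tail: (baz quux)})
dotted = Cons("foo", Cons(head="bar", tail=from_list(["baz", "quux"])))
print(to_list(dotted))
# ['foo', 'bar', 'baz', 'quux']
```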
See Clojure’s reader syntax: https://www.clojure.org/reference/reader

You can have vectors, hash maps, and sets in addition to lists, symbols, and keywords.

I don't get why anyone even tries after Clojure. They got it 100% right. It's easier to read than anything else, and still super simple to parse. Commas are whitespace, use them or don't, wherever you want. Namespaced keywords are great. The data structures themselves act as functions. It's just... done.
Yep. There are a couple of places where I think Clojure made the wrong choices, but they are few and far between. Overall, there is so much that was done correctly. I don’t even know if I could program without persistent data structures anymore, for instance.
Kernel has first-class environments which aren't just lists, but can be constructed from lists. Environments are encapsulated, so we can't simply peek into them with car and cdr - we can only obtain the value associated with a given symbol by evaluating the symbol in that environment.

    ($define! foo
        ($bindings->environment
            (bar "Hello World")
            (baz 1234)
            (qux #f)))
            
    ($remote-eval bar foo)          ==> "Hello World"

    foo                             ==> #[environment]
We could perhaps make something a bit more friendly. Let's create an encapsulated `struct` type which could give us the contents as a plain list, or let us look up each field:

    ($provide! ($struct struct? destruct $get)
            
        ($define! (struct-intro struct? struct-elim) 
            (make-encapsulation-type))
                
        ($define! destruct
            ($lambda (struct)
                (cdr (struct-elim struct))))
    
        ($define! $get
            ($vau (struct member) e
                ($let ((record (car (struct-elim (eval struct e)))))
                    (eval member record))))
                    
        ($define! zip
            ($lambda (keys values)
                ($if ($and? (null? keys) (null? values))
                     ()
                     (cons (list (car keys) (car values)) (zip (cdr keys) (cdr values))))))
                    
        ($define! $struct
            ($vau kvpairs env
                ($let* ((keys (map car kvpairs))
                        (values (map ($lambda (pair) (eval (cadr pair) env)) kvpairs))
                        (record (apply (wrap $bindings->environment) (zip keys values))))
                    (struct-intro (cons record values))))))
Example usage:

    ($define! foo
        ($struct
            (bar "Hello World")
            (baz (+ 12 43))
            (qux #f)))              ==> #inert
            
    (struct? foo)                   ==> #t
    (pair? foo)                     ==> #f
    (environment? foo)              ==> #f
    
    (destruct foo)                  ==> ("Hello World" 55 #f)
    
    ($get foo bar)                  ==> "Hello World"
    ($get foo baz)                  ==> 55
    ($get foo qux)                  ==> #f
    ($get foo foo)                  ==> ERROR: Unbound symbol: foo

    foo                             ==> #[encapsulation]
Kernel: https://web.cs.wpi.edu/~jshutt/kernel.html

Klisp (essentially complete implementation of Kernel): https://github.com/dbohdan/klisp

Is the empty list also an s-expression? If so, would it not be slightly more correct to define s-expressions as either an atom or a list of zero or more s-expressions?
An empty list is atomic! What could it break down into?
Where is the grotesque code to parse these rectangles? :)
I can only ask “…but why?” I don’t get what this complexity buys us.
Solves a mess with closing parens. But I'm not sure if it is worth the hassle. Anyway, it exists.
dispense with the parentheses:

  (eq (mul (a a)) (pow (a 2)))
becomes

  eq
    mul
      a a
    pow
      a 2
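Going from the indented form back to the parenthesized one is mechanical. A minimal Python sketch follows, under a rule assumed purely for illustration (it is not taken from any existing indentation-based Lisp syntax): a line with several tokens, or with more-indented lines below it, becomes a list, and a lone token with nothing below it stays an atom. The function name `indent_to_sexpr` is made up.

```python
def indent_to_sexpr(text):
    """Convert an indented outline into a parenthesized string.

    Indentation must use spaces; blank lines are ignored.
    """
    rows = [(len(line) - len(line.lstrip(" ")), line.split())
            for line in text.splitlines() if line.strip()]

    def parse(i, depth):
        """Render consecutive rows at `depth`; return (items, next_index)."""
        items = []
        while i < len(rows) and rows[i][0] == depth:
            tokens = rows[i][1]
            i += 1
            children = []
            if i < len(rows) and rows[i][0] > depth:
                # Deeper lines belong to the line just read.
                children, i = parse(i, rows[i][0])
            parts = tokens + children
            items.append(parts[0] if len(parts) == 1
                         else "(" + " ".join(parts) + ")")
        return items, i

    if not rows:
        return ""
    items, end = parse(0, rows[0][0])
    if end != len(rows):
        raise ValueError("inconsistent indentation")
    return " ".join(items)


outline = """\
eq
  mul
    a a
  pow
    a 2
"""
print(indent_to_sexpr(outline))
# (eq (mul (a a)) (pow (a 2)))
```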
That's Wisp, I don't care for it, but people who really like to assign semantic meaning to precise counts of invisible characters may find it interesting.
Absolutely no counting is required at all so I think your joke falls a little flat.

Our visual system has the ability to detect implied straight lines (and other simple geometric outlines) from very small clues.

Therefore "seeing" the vertical lines implied by the indentation is effortless - so it's immediately obvious which elements belong to each other.

Indentation is an incredibly valuable "brain" hack that manages to instantly communicate hierarchy, not something to be sneered at.

We have no such innate ability to match parentheses - determining hierarchy in a jumble of open and close parentheses requires precise counting or, typically these days, tool/editor/IDE support.

I also don't know why this is treated as controversial: the first thing every project does is declare a canonical code formatting (aka whitespace layout) and start rejecting patches which don't follow it.
> Absolutely no counting is required at all so I think your joke falls a little flat.

Really? How do you see the difference between "TAB" and "SPACE SPACE TAB"?

The solution to that is easy: do not use tabs.
That is not the approach taken by any system that I know of that uses semantic whitespace. Tabs are meaningful in Python and make, and in fact required in make.
That’s a design flaw of ASCII inherited by Unicode, not an argument against using indentation to portray hierarchy. I’m sure there are alternative Unicode characters which render almost indistinguishably from “(” also.
there's a hybrid form (sweet-expressions ? i forgot), top-level terms are parens-free

    eq (mul a a)
       (pow a 2)

    defun min (a b)
      (if (a < b) a b)

IIRC the hack to support this at read time was minimal, and it made a big impact in terms of "mainstream appeal"
IMO, make some syntax element that means "each line indented from this one has one single element, unless indented further".

    eq :
        mul a a
        pow a 2

    defun min (a b) :
        if (< a b) a b
That means you can also have haskell-like continuations:

    seq :
        eq (mul a a)
            (pow a 2)
        eq 4 3
the elvish shell uses the syntax in the first example:

    > eq (* 3 3) (math:pow 3 2)
    ▶ $true
(function definitions, on the other hand, use smalltalk/ruby style blocks)
Lispython

I should trademark this name.

As a Lisp programmer, just no.
Thank you for the criticism. Lots of lispers share your opinion.
Is this.... is this a joke?
I don't intend to be funny. Just a bit childish, but in a good way :)
I'm scared the people who decide how I write Dune config files will see this and think it's a good idea.

Terrified, really.