Hacker News new | ask | show | jobs
by Joker_vD 2109 days ago
Exercise for the reader: Write a simple arithmetic expression language parser, with numbers, variables, parentheses and working precedence levels. Extend it with "EXP0 EXP1 ... EXPn" syntax for function calls. Now extend it with "VAR1 ... VARn '.' EXP" syntax for lambdas. Yes, with no "fun", or "\", or "λ" to introduce it—you realise you have a function definition only when you arrive at the dot. The function definition spans as long as it makes sense.

Pretty fun, although requires some small hackery (I solved it by testing, at seeing the dot, that the "apply" list I've build to this point contains only variable names, and then re-using it as the "arglist").

2 comments

I don't know why you'd want to write lambda like that.

Long ago, I made a prototype mixfix syntax for Scheme intended for a shell-like REPL with ISWIM overtones. The paren-less function call syntax (terminated by newline or semicolon, like shell) was done by modifying the Pratt engine to treat juxtaposition as application. Sexp syntax was also valid as parenthesized expressions.

Well, when I played with a minimal lambda-calculus evaluator, I quickly got tired of having to print "\" or whatever to denote the start of a lambda. But look, for example, at this:

    (f. (x. f (v. x x v)) (x. f (v. x x v)))
        fct. n.
            (church0? n)
            (_. church1)
             _. church* n (fct (church- n 1))
Are "\" really needed there? You have to put lots parens to denote where lambdas end anyway, and they also happen to show where they start too, and a dot is a clear enough signal for a human (IMHO) to see that it's a lambda. So that's how I've came up with the idea of writing lambdas like that. Then I extended my lambda-calculus evaluator with some arithmetic primitives, and yes, the syntax got somewhat peculiar.
I haven't made the Show HN post yet, but using the parser combinator library that I've been building[1], here's an answer to your exercise:

  module Joker_vDChallengeParser
    include Parsby::Combinators
    extend self

    def parse(s)
      (expr < eof).parse(s)
    end

    define_combinator(:exp_op) {|left, right| group(left, spaced(lit("^")), right) }

    define_combinator(:pos_op) {|subexpr| group(lit("+") < ws, subexpr) }
    define_combinator(:neg_op) {|subexpr| group(lit("-") < ws, subexpr) }

    define_combinator(:div_op) {|left, right| group(left, spaced(lit("/")), right) }
    define_combinator(:mul_op) {|left, right| group(left, spaced(lit("*")), right) }

    define_combinator(:add_op) {|left, right| group(left, spaced(lit("+")), right) }
    define_combinator(:sub_op) {|left, right| group(left, spaced(lit("-")), right) }

    define_combinator(:call_op) {|left, right| group(left, ws_1, right) }

    define_combinator :identifier do
      first_char = char_in([*('a'..'z'), *('A'..'Z'), '_'].join)
      rest_char = first_char | char_in([*('0'..'9')].join)
      first_char + join(many(rest_char))
    end

    define_combinator :lambda_expr do
      group(
        sep_by_1(ws_1, identifier) < spaced(lit(".")),
        expr,
      )
    end

    def scope(x, &b)
      b.call x
    end

    define_combinator :expr do
      lazy do
        e = choice(
          decimal,
          lambda_expr,
          identifier,
          between(lit("("), lit(")"), expr),
        )

        # Each e = scope ... block is a precedence level. You can switch
        # them around to play with the precedence of the operators.
        #
        # hpe -- higher precedence level expression
        # spe -- same precedence level expression

        # Basing myself on Haskell and making function calls the highest
        # precedence operators.
        e = scope e do |hpe|
          reduce hpe do |left_result|
            choice(
              call_op(pure(left_result), hpe),
            )
          end
        end

        # Our only right-associative precedence level.
        e = scope e do |hpe|
          recursive do |spe|
            choice(
              exp_op(hpe, spe),
              hpe,
            )
          end
        end

        e = scope e do |hpe|
          recursive do |spe|
            choice(
              neg_op(spe),
              pos_op(spe),
              hpe,
            )
          end
        end

        # Left-associativity done by parsing left operand bottom-up and
        # right operands top-down.
        e = scope e do |hpe|
          reduce hpe do |left_result|
            choice(
              mul_op(pure(left_result), hpe),
              div_op(pure(left_result), hpe),
            )
          end
        end

        e = scope e do |hpe|
          reduce hpe do |left_result|
            choice(
              add_op(pure(left_result), hpe),
              sub_op(pure(left_result), hpe),
            )
          end
        end
      end
    end
  end

That was a nice exercise. Here's some example parses:

  pry(main)> Joker_vDChallengeParser.parse "- 3 - foo bar . foo - bar - 5"                                     
  => [["-", 3], "-", [["foo", "bar"], [["foo", "-", "bar"], "-", 5]]]
  pry(main)> Joker_vDChallengeParser.parse "- 3 - (foo bar . foo - bar - 5) - 4 * -2 + 5 / 2"                 
  => [[[["-", 3], "-", [["foo", "bar"], [["foo", "-", "bar"], "-", 5]]], "-", [4, "*", ["-", 2]]], "+", [5, "/", 2]]
  pry(main)> Joker_vDChallengeParser.parse "- 3 - (foo bar . foo 5 6 - bar - 5) - 4 * -2 + 5 / 2"             
  => [[[["-", 3], "-", [["foo", "bar"], [[[["foo", " ", 5], " ", 6], "-", "bar"], "-", 5]]], "-", [4, "*", ["-", 2]]], "+", [5, "/", 2]]

  pry(main)> Joker_vDChallengeParser.parse "foo bar . foo bar baz qux . baz qux"                              
  => [["foo", "bar"], [["foo", "bar", "baz", "qux"], ["baz", " ", "qux"]]]
  pry(main)> Joker_vDChallengeParser.parse "foo bar . foo bar + baz qux . baz qux"                            
  => [["foo", "bar"], [["foo", " ", "bar"], "+", [["baz", "qux"], ["baz", " ", "qux"]]]]

You can try it by cloning my repo, running `./bin/console`, and pasting the module, or putting it in a file in `lib/`. If you put it in a file, you can reload changes with `reload!` in the console.

[1] https://github.com/jolmg/parsby

EDIT: Fixed the syntax for newer versions of Ruby.

Looking back at it, I forgot to allow spacing between parentheses and inner expression

  between(lit("("), lit(")"), spaced(expr)),
and using space as an operator prevents `(foo)(bar)` from being parsed as a call.

This is better, just adjacent expressions with optional in-between whitespace:

  define_combinator(:call_op) {|left, right| group(left < ws, right) }
Too late to edit, though. Oh well.