by TOGoS 392 days ago
I wrote a paper on REBOL back in college. It is very interesting, but the syntax is definitely weird. You might think of the function call syntax as being sort of Forth-like, but with the tokens in reverse order. So like a Lisp, but without required parentheses. e.g. in the example

  send friend@rebol.com read http://www.cnn.com
`read` knows that it takes one argument, and `send` knows that it takes two, so this ends up being grouped like

  (send friend@rebol.com (read http://www.cnn.com))
(which I think is valid syntax; that AST node is called a 'paren').
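To make the grouping concrete, here is a rough Python sketch of arity-driven grouping. The `ARITY` table and the `group` helper are made up for illustration; REBOL itself gets the argument count from the function value, not from a static table.

```python
# Hypothetical arity table; REBOL would look this up from the function value.
ARITY = {"send": 2, "read": 1}

def group(tokens, pos=0):
    """Return (tree, next_pos) for the expression starting at tokens[pos]."""
    head = tokens[pos]
    pos += 1
    if head not in ARITY:
        return head, pos              # a plain value: nothing more to consume
    args = []
    for _ in range(ARITY[head]):      # pull exactly as many arguments as needed
        arg, pos = group(tokens, pos)
        args.append(arg)
    return (head, *args), pos

tokens = "send friend@rebol.com read http://www.cnn.com".split()
tree, _ = group(tokens)
print(tree)
# ('send', 'friend@rebol.com', ('read', 'http://www.cnn.com'))
```

The nested tuple has the same shape as the parenthesized form above.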

Weirdly, the language also has some infix operators, which seem a bit out-of-place to me. I have no idea how the 'parser'[1] works.

[1] 'parsing' happens so late that it feels funny to call it that. By 'parser' I mean the thing that knows how to treat an array as a representation of an evaluatable expression and evaluate it.


> Weirdly, the language also has some infix operators, which seem a bit out-of-place to me. I have no idea how the 'parser'[1] works.

There are no keywords or statements, only expressions. Square brackets ("blocks") are used for both code and data, similar to a Lisp list. The main language (called the "'do' dialect") is entirely Polish notation with a single exception for infix operators: whenever a token is consumed, check the following token for an infix operator. If it is one, immediately consume the token after it as well and evaluate the infix operator.

This results in a few oddities / small pitfalls, but it's very consistent:

* "2 + 2 * 2" = 8 because there is no order of operations; infix operators are simply evaluated as they're seen

* "length? name < 10" errors (if "name" isn't a number) because the infix operator "<" is evaluated first to create the argument to "length?"
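Both pitfalls fall out of the lookahead rule, which can be sketched in a few lines of Python. This is only a toy model under my own assumptions (whitespace-separated tokens, integer literals, a hypothetical `PREFIX` table holding `length?`), not how Rebol or Red is actually implemented.

```python
import operator

INFIX = {"+": operator.add, "*": operator.mul, "<": operator.lt}
# Hypothetical prefix functions as (arity, callable); "length?" mirrors the example.
PREFIX = {"length?": (1, len)}

def eval_value(tokens, pos, env):
    """Evaluate one value (no infix lookahead); return (value, next_pos)."""
    tok = tokens[pos]
    pos += 1
    if tok in PREFIX:
        arity, fn = PREFIX[tok]
        args = []
        for _ in range(arity):
            arg, pos = eval_expr(tokens, pos, env)  # arguments get the lookahead
            args.append(arg)
        return fn(*args), pos
    if tok in env:
        return env[tok], pos
    return int(tok), pos

def eval_expr(tokens, pos, env):
    """Evaluate a value, then keep folding in infix operators left to right."""
    left, pos = eval_value(tokens, pos, env)
    while pos < len(tokens) and tokens[pos] in INFIX:
        op = INFIX[tokens[pos]]
        right, pos = eval_value(tokens, pos + 1, env)
        left = op(left, right)
    return left, pos

def run(source, env=None):
    value, _ = eval_expr(source.split(), 0, env or {})
    return value

print(run("2 + 2 * 2"))  # 8
try:
    run("length? name < 10", {"name": "Rebol"})
except TypeError as err:
    print("error:", err)   # the "<" ran first, comparing a string with 10
```

In this sketch the argument of `length?` is evaluated as a full expression, so `name < 10` is attempted before `length?` ever sees a value.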

From your brief description that is likely incomplete, it looks as if the length? function is treated as a prefix operator of low precedence relative to the infix operators. The infix operators are all at the same precedence level and have left-to-right associativity.

I made an infix parser in which certain prefix operators (named math functions) have a low precedence. This allows for things like

  1> log10 5 + 5   ;; i.e. log10 10
  1.0
But a different prefix operator, like unary minus, binds tighter:

  2> - 5 + 5
  0
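For anyone who wants to play with the idea, the two results above can be reproduced with a small precedence-climbing (Pratt-style) evaluator in Python. To be clear, this is not the Shunting Yard implementation described in this comment, and the binding-power numbers are assumptions picked only so that `log10` binds looser than `+` and unary minus binds tighter. The dynamic precedence extension is not attempted here.

```python
import math
import operator

# Illustrative binding powers (assumptions, not taken from the parser above).
INFIX_BP = {"+": 10, "-": 10, "*": 20, "/": 20}
INFIX_FN = {"+": operator.add, "-": operator.sub,
            "*": operator.mul, "/": operator.truediv}
PREFIX_OPS = {
    "log10": (5, math.log10),    # looser than every infix operator
    "-": (30, lambda x: -x),     # unary minus: tighter than every infix operator
}

def parse(tokens, min_bp=0):
    """Evaluate tokens (a list, consumed from the front) by precedence climbing."""
    tok = tokens.pop(0)
    if tok in PREFIX_OPS:
        bp, fn = PREFIX_OPS[tok]
        left = fn(parse(tokens, bp))   # operand extends while operators bind tighter than bp
    else:
        left = float(tok)
    while tokens and tokens[0] in INFIX_BP and INFIX_BP[tokens[0]] > min_bp:
        op = tokens.pop(0)
        right = parse(tokens, INFIX_BP[op])
        left = INFIX_FN[op](left, right)
    return left

print(parse("log10 5 + 5".split()))  # 1.0
print(parse("- 5 + 5".split()))      # 0.0
```

Because `log10` has a lower binding power than `+`, its operand swallows the whole `5 + 5`; unary minus stops after the first `5`.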
I invented a dynamic precedence extension to Shunting Yard which allows this parse:

  3> log10 5 + 5 + log10 5 + 5    ;; i.e. (log10 5 + 5) + (log10 5 + 5)
  2.0
Functions not registered with the parser are subject to a phony infix treatment if their arguments look like they might be infix, so something similar to your Red example happens:

  4> len "123" - 2
  ** -: invalid operands "123" 2
"123" - 2 turns into a single argument to len, which does not participate in the infix parsing at all. log10 does participate because it is formally registered as a prefix operator.

The following are also the result of the "phony infix" hack:

  4> 1 cons 2
  (1 . 2)
  5> 1 cons 2 + 3
  (1 . 5)
A non-function in first place and a function in second place leads to a swap; in addition, the arguments are analyzed for infix.

Not sure why you got downvoted, but Rebol is not a Lisp. It works not because of precedence rules or special rules, but because arguments accumulate until there are enough to evaluate the previous function on the stack, so you can do stuff like

  print tostring 5 + cos pi
Works a bit like a shift/reduce parser, with heavy use of fexprs (blocks in Rebol parlance)

I know you enjoy Lisps, so you might like this toy Rebol evaluator written in Scheme: http://ll1.ai.mit.edu/marshall.html

I'm guessing because their understanding is still wrong:

  3> log10 5 + 5 + log10 5 + 5    ;; i.e. (log10 5 + 5) + (log10 5 + 5)
  2.0
In Rebol this would be equivalent to (log10 (5 + (5 + (log10 (5 + 5)))))

> I have no idea how the 'parser'[1] works

I think parsing there depends on the actual value of the current token. So if you assign send to another variable and use that the "parser" will still recognize that it takes 2 parameters.
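A quick way to see why aliasing would still work: if the evaluator looks up the value bound to a word and asks that value how many arguments it takes, the name itself is irrelevant. A minimal Python sketch, where `send` and `read` are harmless stand-ins (no mail, no network) and `mail` is a hypothetical alias:

```python
import inspect

def send(address, body):
    # Stand-in for illustration only; does not send anything.
    return f"to {address}: {body}"

def read(url):
    # Stand-in for illustration only; does not fetch anything.
    return f"<contents of {url}>"

env = {"send": send, "read": read}
env["mail"] = env["send"]   # hypothetical alias bound to the same function value

def eval_prefix(tokens, env):
    """Evaluate one prefix expression, consuming tokens from the front."""
    tok = tokens.pop(0)
    value = env.get(tok, tok)           # unknown words evaluate to themselves
    if callable(value):
        # The arity comes from the value, not from the word that named it.
        arity = len(inspect.signature(value).parameters)
        args = [eval_prefix(tokens, env) for _ in range(arity)]
        return value(*args)
    return value

print(eval_prefix("mail friend@rebol.com read http://www.cnn.com".split(), env))
# to friend@rebol.com: <contents of http://www.cnn.com>
```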

It's an interesting design, definitely not something one sees frequently.

But why? I don't get this design choice at all.

It takes the common "function(arg1, arg2)" pattern and turns (almost) the whole language into very simple, consistent Polish notation: https://en.wikipedia.org/wiki/Polish_notation

Even things that are normally keywords and statements in other languages (like conditionals and loops) are actually just functions that conform to the exact same parsing rules.
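As a rough illustration of conditionals and loops being ordinary functions, here is a toy evaluator in Python where blocks are plain lists that stay unevaluated until a function decides to run them. Everything here (`do_block`, `eval_at`, the `ENV` table, the `even?` helper) is a hypothetical sketch, not Rebol's real implementation.

```python
def do_block(block, env):
    """Evaluate every expression in a block; return the last value."""
    pos, result = 0, None
    while pos < len(block):
        result, pos = eval_at(block, pos, env)
    return result

def eval_at(block, pos, env):
    """Evaluate one prefix expression starting at block[pos]."""
    item = block[pos]
    pos += 1
    if isinstance(item, str) and item in env:
        arity, fn = env[item]
        args = []
        for _ in range(arity):
            arg, pos = eval_at(block, pos, env)
            args.append(arg)
        return fn(env, *args), pos
    return item, pos   # numbers, strings and nested lists (blocks) are returned as-is

def either(env, condition, true_block, false_block):
    # The "conditional" is just a function that runs one of two blocks.
    return do_block(true_block if condition else false_block, env)

def loop(env, count, body):
    result = None
    for _ in range(count):
        result = do_block(body, env)
    return result

output = []
ENV = {
    "either": (3, either),
    "loop": (2, loop),
    "print": (1, lambda env, value: output.append(value)),
    "even?": (1, lambda env, n: n % 2 == 0),
}

program = ["loop", 2, ["either", "even?", 4, ["print", "even"], ["print", "odd"]]]
do_block(program, ENV)
print(output)  # ['even', 'even']
```

`either` and `loop` go through exactly the same argument-collecting path as `print`; the only thing special about them is that they choose when to evaluate the blocks they were handed.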