by peter303 3863 days ago
Forth programs are very compact. And so are Forth interpreters. Great for 1970s PCs with memories as small as 8K. But postfix programs are hard to understand. An hour later and you've forgotten what you have written.

Postfix isn't hard to understand.

Postfix with no syntactic delimitation is hard to understand.

The problem with Forth is that although we understand what each word does in isolation, when we look at a word in the middle of the program, we do not have an instant idea about what material to the left of the word produces its arguments and how, and what material to the right consumes its result. We have to go back and simulate the stack machine in our mind to work this out.

What does this do? w1 w2 w3 w4?

We have to know what w1 through w4 do to understand how their data flows are connected together! That's just wrong. In any sane language, we don't have to know what w4 exactly does in order to understand that it, for instance, takes two inputs produced by w1 through w3.

The above words could be totally independent from each other, and produce four operands on the stack. Or it could be that they thread a value through them: each one transforms the top word on the stack.
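A rough Python sketch of that ambiguity, assuming each word is described only by a hypothetical stack effect (values consumed, values produced). The names `trace_depth`, `independent` and `threaded` are made up for illustration. The same four tokens give very different stack shapes depending on what the words do:

```python
def trace_depth(words, effects):
    """Return the stack depth after each word, given (consumed, produced) effects."""
    depth = 0
    trace = []
    for word in words:
        consumed, produced = effects[word]
        if depth < consumed:
            raise ValueError(f"{word} needs {consumed} value(s), stack has {depth}")
        depth = depth - consumed + produced
        trace.append(depth)
    return trace


program = ["w1", "w2", "w3", "w4"]

# Reading 1: every word takes nothing and pushes one value.
independent = {w: (0, 1) for w in program}

# Reading 2: w1 pushes a value, the rest each transform the top of the stack.
threaded = {"w1": (0, 1), "w2": (1, 1), "w3": (1, 1), "w4": (1, 1)}

print(trace_depth(program, independent))
print(trace_depth(program, threaded))
```

Nothing in the text `w1 w2 w3 w4` tells the two readings apart; only the effect tables do.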

The expressions in a nice, nested, functional notation give us the functional tree at a glance. We can achieve this while keeping the notation postfix, by adding parentheses:

   (w1 w2 w3)w4
Now w4 is a three-argument function. w1 through w3 take no arguments.

   ((w1)w2 w3)w4
Now w1 produces the argument value for w2. This is the left argument of w4, and w3 is the right argument.

Moreover, now we can reason about different evaluation orders, as long as we know there are no side effects. I can think about w3 and then w2 and w1, or vice versa.

Moreover, if w4 is declared as requiring exactly one argument, the error is now statically obvious.
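As a minimal sketch of how that could work, here is a Python parser for the parenthesized postfix notation above, plus an arity check that never runs the program. The function names and the `arities` table are assumptions for illustration, and the parser assumes well-formed input:

```python
import re


def tokenize(text):
    # Parentheses are their own tokens; everything else splits on whitespace.
    return re.findall(r"[()]|[^\s()]+", text)


def parse_term(tokens, pos=0):
    """Parse an optional '(' args ')' group followed by a word.

    Returns ((word, args), next_position).
    """
    args = []
    if tokens[pos] == "(":
        pos += 1
        while tokens[pos] != ")":
            arg, pos = parse_term(tokens, pos)
            args.append(arg)
        pos += 1  # skip the closing ")"
    word = tokens[pos]
    return (word, args), pos + 1


def arity_errors(node, arities):
    """Collect mismatches between declared arity and the arguments actually given."""
    word, args = node
    errors = []
    if word in arities and arities[word] != len(args):
        errors.append(f"{word} expects {arities[word]} argument(s), got {len(args)}")
    for arg in args:
        errors.extend(arity_errors(arg, arities))
    return errors


tree, _ = parse_term(tokenize("((w1)w2 w3)w4"))
print(tree)
print(arity_errors(tree, {"w4": 1}))
```

The tree comes straight from the delimiters, so the check needs no knowledge of what any word does at run time.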

Also great for myriads of devices with constrained memory and computing power we use every day in 2015.

> But postfix programs are hard to understand

No.

Badly written programs are hard to understand. That's true for every language. Also, programs written in a language you don't know are hard to understand. That's obvious, right? Postfix, prefix or infix notations have little to do with this.

There are natural languages written from right to left, and also ones written from right to left and vertically at that. Are they "hard to understand" for people who use them? What do you think?

The direction a language is written in has no bearing on how easy it is to understand. Postfix is a different matter altogether.
Wait, why? Could you elaborate? How is:

    + 1 2
fundamentally different from:

    1 2 +
? You just look for the operator on the other side of arguments.

I know prefix, postfix and "crazy infix" (J, some APL) languages and I really don't see a qualitative difference. Of course, after you overcome the initial hurdle and get used to the notation.

EDIT: ok, J/K/APL are a special case and I shouldn't mention them.

That's too trivial an example. When you get something more complex, the difference becomes clearer.

Can you parse the following easily?

c_z x * s_z y * s_y * c_y z * + c_x * c_z y * s_z x * - s_x * - d_z =

Compare that to the infix form:

d_z = c_x * (c_y * z + s_y * (s_z * y + c_z * x)) - s_x * (c_z * y - s_z * x)

With infix, the operator sits between its operands, so you can easily see what expression makes up the first operand, and which makes up the second: you just look to its left and its right. Postfix, however, requires you to read the entire expression, because the operator doesn't show which operands it has, they're just whatever the last two were, and those last two might themselves be complex expressions. This gets worse the longer the total length of the expression.
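To make the "read the entire expression" step concrete, here is a small Python sketch that does the stack simulation mechanically and rebuilds fully parenthesized infix from a postfix token list. It assumes every operator is binary, and `postfix_to_infix` is a made-up helper name. It is run on the postfix line above, leaving out the trailing `d_z =`:

```python
def postfix_to_infix(tokens, binary_ops=("+", "-", "*", "/")):
    """Simulate the stack, building infix strings instead of computing values."""
    stack = []
    for tok in tokens:
        if tok in binary_ops:
            if len(stack) < 2:
                raise ValueError(f"not enough operands for {tok}")
            right = stack.pop()
            left = stack.pop()
            stack.append(f"({left} {tok} {right})")
        else:
            stack.append(tok)
    return stack


postfix = "c_z x * s_z y * s_y * c_y z * + c_x * c_z y * s_z x * - s_x * -"
result = postfix_to_infix(postfix.split())
for item in result:
    print(item)
```

The simulation ends with two items on the stack instead of one: `(c_z * x)` is never consumed. That kind of slip is hard to spot by eye in the flat postfix form, which is rather the point.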

Postfix also suffers from not making it clear how many operands a given operator takes. With infix this is always clear.

I like postfix languages for their simplicity, but I refuse to pretend they are as easy to read as infix languages.

(example was taken from https://en.wikipedia.org/wiki/3D_projection#Perspective_proj...)

Well, for me both examples are totally unreadable. I think math's is the most unreadable notation ever and even a cross between PERL and Brainfuck would be better. Mathematicians are masochists, and I refuse to follow their lead. I prefer meaningful variable names, good use of horizontal and vertical whitespace, context-free grammars, and the like. So maybe this is the difference and the reason for me perceiving the notations as more or less equivalent: I'm not biased, as most people, in favor of infix, but rather biased the other way.

Anyway, let's try doing something with your postfix example (I hope it displays alright; also, I think you made a mistake in your translation to postfix, but I left it as it is):

    c_z   x *
    s_z   y *
        s_y *
    c_y   z *
    +
        c_x *

    c_z   y *
    s_z   x *
    -
        s_x *
    -

        d_z =
Now, this is much more readable than the unformatted infix version you gave, and that's before factoring this into smaller parts. I haven't read that much Forth, but I'm 97% sure that it would be factored into 3 or 4 words. And it also has the advantage that you read the operations in the order they're going to be executed, while with infix you need to read the entire expression to know which computation occurs first.

Of course, it's only more readable for me, with my particular background knowledge and habits; I'm not saying this is or should be the same for anyone else. But, if there are people who see one form as more readable and people who see the other form as more readable, then I think that's a solid argument in favor of both forms not being drastically, qualitatively different.

The problems you point to are definitely real; it's true "the operator doesn't show which operands it has" by default (for example) and you need to go out of your way to show it. But infix notation has its problems too, and you need to work around them as well. Like operator precedence, which is frankly a terrible idea.

> Postfix also suffers from not making it clear how many operands a given operator takes. With infix, this is always clear.

No, not always. J has words which take one argument on the left and, for example, two on the right. Or three. Or a variable number of arguments on the left (granted, they are `tied` with ` word, but still) and nothing on the right... And Ken Iverson says it's very readable! (To him, at least).

To summarize: I'm still not convinced that there is a major and unfixable difference in readability between the notations, and I still think you can make all 3 notations as readable as any other.

It's not the direction, it's the lack of delimiting.

  ((1 2 +) (3 4 +) *)
is fundamentally different from

  1 2 + 3 4 + *
Each word has a clear arity, and the arguments are delimited. We can follow the evaluation of the parenthesized expression in multiple orders and they come up with the same answer.
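A quick Python sketch of the evaluation-order point, assuming the delimited expression is modeled as nested tuples with the operator last. The `evaluate` helper and its `log` parameter are invented for illustration:

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}


def evaluate(expr, right_to_left=False, log=None):
    """Evaluate a nested (arg, ..., op) tuple, visiting arguments in either order."""
    if not isinstance(expr, tuple):
        return expr
    *args, op = expr
    indices = range(len(args))
    if right_to_left:
        indices = reversed(indices)
    values = [None] * len(args)
    for i in indices:
        values[i] = evaluate(args[i], right_to_left, log)
    result = OPS[op](*values)
    if log is not None:
        log.append(f"{' '.join(map(str, values))} {op} = {result}")
    return result


expr = ((1, 2, "+"), (3, 4, "+"), "*")  # ((1 2 +) (3 4 +) *)

ltr_log, rtl_log = [], []
ltr = evaluate(expr, right_to_left=False, log=ltr_log)
rtl = evaluate(expr, right_to_left=True, log=rtl_log)
print(ltr, ltr_log)
print(rtl, rtl_log)
```

The two runs reduce the subexpressions in a different order but agree on the result, which is only safe to rely on because the operators here have no side effects.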
If you take a look at my reply to TazeTSchnitzel you'll see that I "solved" (I think? at least tried to) this problem with indentation (generally whitespace). You can delimit expressions in many ways.

And in Forth's case, you can very easily define your own delimiters:

    : [ ;
    : ] ;
which should make you able to write:

    [ [ 1 2 + ] [ 3 4 + ] * ] . 
    21  ok
(tested with gforth and works [EDIT: but of course breaks Forth! Both [ and ] are already defined, so in practice you'd rather choose other chars])

I mean, there is no rule saying that postfix notation cannot provide grouping constructs. I still fail to see a fundamental difference here :)

BTW, it's not going as fast as I'd like, but I managed to parse TXR man page and use it for displaying docs along auto-completion: https://github.com/piotrklibert/txr-mode/blob/master/screen....

I think I'll be able to find some time this weekend (or next weekend) to clean up the code and make it usable for others as well :)

Whitespace that is not significant to the machine does nothing to help me convince myself that the code is correct. Indentation could be wrong.

If I already know that the code is correct and properly indented, then it helps the readability.

> And in Forth case, you can very easily define your own delimiters:

Those delimiters do nothing but occupy interpreter cycles. Hopefully they get recognized as noops and optimized away by a Forth compiler.

The machine will accept garbage like:

  ] 3 2 + [ 4 /
The fake syntax you've created is sort of like a cargo cult airplane made out of bamboo sticks and palm leaves. It has some value as an annotation of correct code, that is all. It could be a useful annotation tool in the process of verifying a piece of code and convincing myself that it's correct. Forth should have these markers built-in so they don't have to be defined as words, and it should check their pairing and nesting. (A syntax highlighting engine can be taught to do that, of course.)
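For what it's worth, the pairing and nesting check itself is tiny. A rough Python sketch of what such a checker might do, assuming whitespace-separated tokens and the `[` / `]` markers from above; `check_delimiters` is a hypothetical name, not anything Forth or gforth provides:

```python
def check_delimiters(source, open_tok="[", close_tok="]"):
    """Return None if the markers pair up and nest, else a short error message."""
    depth = 0
    for index, tok in enumerate(source.split()):
        if tok == open_tok:
            depth += 1
        elif tok == close_tok:
            if depth == 0:
                return f"unmatched {close_tok} at token {index}"
            depth -= 1
    if depth:
        return f"{depth} unclosed {open_tok}"
    return None


print(check_delimiters("[ [ 1 2 + ] [ 3 4 + ] * ]"))
print(check_delimiters("] 3 2 + [ 4 /"))
```

This only checks that the markers balance. It says nothing about whether each group leaves the stack in the shape its brackets suggest, which would need the words' stack effects as well.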