by icarot 4235 days ago
I'm so confused by this. His solutions end up using recursion, yet he says it's stack-free. Which doesn't make any sense at all considering deficient compilers allocate stack frames on each call for languages like C. This is all chapter 1 SICP-grade stuff. Unless you're going to do a technique (read: hack) such as trampolining (certainly not done here) I don't see how it's possible to have recursion without a stack. The runtime should keep pushing frames. Notice there's still a pointer defined in the Hanoi().

I don't think this can be computed on a register machine. What do you girls/guys think?

4 comments

The title ("Stack Free Recursion") is certainly misleading, as are the labels of several examples ("Stackless Factorial", "This Form of Routine Can Be Changed to Stack Free Recursion", the function stackfree(x), "Stackless Towers of Hanoi"), in that all these examples, and all but one example in the paper, implicitly use a stack for return addresses.

However, in the abstract, the definition of the bet, and throughout much of the article, the author does distinguish between the concepts of a "data stack" and a "return stack", and he illustrates one case in which the return stack is in fact eliminated entirely (and says, at the bottom, that it is "sometimes possible" to remove the return stack). Overall, 19/48 instances of the string "stack" are preceded by either "data" or "return".

Is it fair to use "stackless" to mean "data-stack-less", without saying so explicitly? I would say no. The stated advantage is that less memory is used (and that the result "sometimes is faster", presumably due to fewer push/pop operations). However, "stackless" implies that you've eliminated the problem completely, when you've achieved less, perhaps significantly less, than that. For example, in the case of factorial, the naive approach stores two CPU words per level of recursion (the value of x and the return address), and the (data-)stackless approach stores one CPU word per level (the return address); this fixes half the problem.

Also, see this in the abstract: "We also present a method that uses no return stack or data stack and we derive a simple line drawing function using the information presented herein." This wording strongly suggests that the line drawing function doesn't use a return stack, when it does. A comma after "data stack" would make it a little more obvious that the two statements in the sentence are not related to each other. I would have to call this paper "oversold".

Definitely agree on the "oversold".

I notice you said "half the problem" at the end of your third paragraph there. I'm assuming the other half is that it seems foolish to save the return address for every recursive call. Only the first call should save a return address. I think one problem with compiler writers is that, oddly, they don't treat a procedure and its recursive calls as different cases. They just fit every procedure into the same bucket-list algorithm of 1. allocate frames 2. allocate return address.

Hanoi is not the best example of a problem calling for recursion, as the following iterative solution shows:

    max = 1 << no_of_discs;
    for (x = 1; x < max; x++)
        printf("move a disc from %d to %d\n", (x&x-1)%3, ((x|x-1)+1)%3);
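The fragment above leaves its variables undeclared. A self-contained sketch of the same loop in C follows; the function names hanoi_move and hanoi_print are hypothetical, pegs are assumed to be numbered 0, 1, 2, and the disc count is assumed to be smaller than the bit width of unsigned.

```c
#include <stdio.h>

/* Source and destination peg of move number x (1-based), using the
   same bit expressions as the loop quoted above. */
void hanoi_move(unsigned x, int *from, int *to)
{
    *from = (int)((x & (x - 1)) % 3);
    *to   = (int)(((x | (x - 1)) + 1) % 3);
}

/* Prints all 2^discs - 1 moves without recursion or an explicit stack. */
void hanoi_print(unsigned discs)
{
    unsigned max = 1u << discs;   /* assumption: discs < bits in unsigned */
    for (unsigned x = 1; x < max; x++) {
        int from, to;
        hanoi_move(x, &from, &to);
        printf("move a disc from %d to %d\n", from, to);
    }
}
```

Calling hanoi_print(3) prints seven moves. The only state is the loop counter x, which is why this particular problem does not need a stack of any kind.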
You're right. The de-facto way to prove an algorithm is stack-free is to run it on a register machine, which has no stack to begin with. As you suggest, it'd also be optimal to use a problem which calls for deep recursion.

Hence the task, should ye choose to accept, is to compute the Ackermann function as an iterative process (i.e., with state variables in the recursion keeping track of which step of the process you are at). The following is the easy, non-iterative process:

    (define (A x y)
      (cond ((= y 0) 0)
            ((= x 0) (* 2 y))
            ((= y 1) 2)
            (else (A (- x 1)
                     (A x (- y 1))))))

Space is O(n), to record past results in one of the two subtrees of this tree recursion at any given time; the two subtrees must be evaluated separately, and at separate times if evaluation is deterministic (not concurrent). So if you were to recast it iteratively, you'd hold an immaculate benchmark of an abstrusely deep recursion which you could run on a register machine. I'm not sure if anyone has done this.
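One way to recast the procedure above as a loop is sketched here in C. It is only a sketch under stated assumptions: the name ackermann_iter and the PENDING_MAX cap are hypothetical, and because it keeps an explicit array of pending x values it moves the stack into data rather than eliminating it, so it does not settle the register-machine question.

```c
#include <stddef.h>

#define PENDING_MAX 4096   /* hypothetical cap on pending outer calls */

/* Computes the A(x, y) variant defined above with a loop.
   Returns 1 and stores the result in *out on success; returns 0 if the
   explicit array of pending calls would overflow. */
int ackermann_iter(unsigned x, unsigned long long y, unsigned long long *out)
{
    unsigned pending[PENDING_MAX];   /* x values still waiting for their y */
    size_t top = 0;

    pending[top++] = x;
    while (top > 0) {
        x = pending[--top];
        if (y == 0) {
            /* A(x, 0) = 0: y is already 0 */
        } else if (x == 0) {
            y = 2 * y;               /* A(0, y) = 2y */
        } else if (y == 1) {
            y = 2;                   /* A(x, 1) = 2 */
        } else {
            /* A(x, y) = A(x-1, A(x, y-1)): evaluate the inner call first,
               then feed its result to the pending outer call. */
            if (top + 2 > PENDING_MAX)
                return 0;
            pending[top++] = x - 1;
            pending[top++] = x;
            y = y - 1;
        }
    }
    *out = y;
    return 1;
}
```

The variable y doubles as the running result, so the only per-call state left is the x of each pending outer call.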

> I don't see how it's possible to have recursion without a stack.

Well, it depends what you mean by "without a stack", and it's also the case that for some recursive algorithms, you can make them stackless.

Any recursive algorithm is isomorphic to a non-recursive algorithm and a separate stack (like a linked list), but that's still "with a stack".

Many recursive algorithms are tail-call recursive, which means that you can reduce the algorithm to a fixed-space loop. For example,

    int fac_acc(int accum, int n){
        if(n==0) return accum;
        else return fac_acc(accum*n, n-1);
    }
    int fac(int n){return fac_acc(1,n);}
is equivalent to a loop

    int fac(int n){
        int accum = 1;
        while(n>0){
            accum *= n;
            n--;
        }
        return accum;
    }
Compilers like GCC and Clang will make this optimization when optimizations are enabled (for example at -O2), although the C standard does not require it.

Other (co)recursive algorithms are WHNF tail-call recursive, meaning that the algorithm can be expressed as a function that returns a partially evaluated value and another function to evaluate the next part of the value. This also requires constant space, but you won't find it in a whole lot of languages. It's popular in Haskell, for example:

    fibs a b = a : fibs b (a+b)
    fib n = fibs 0 1 !! n
Even though "fibs" (the function that generates the list of Fibonacci numbers) is recursive, and, in fact, infinitely recursive, it requires constant space.

You may also be interested in reading http://wwwhome.ewi.utwente.nl/~fokkinga/mmf91m.pdf , which explains a number of morphisms that might be leveraged to transform certain recursive functions into constant-space variants.

Thanks so much for that paper. That'll save me time thinking about transforming functions on my own.

Are you sure they make that optimization? How do you know? (I'm not familiar with either code base).

Are you absolutely sure that example is O(1)? It looks O(n) to me. I'm not experienced in Haskell, by the way: disclaimer.

Each call to 'fibs' constructs a list with its first argument and the recursive call. But for all of the recursions (i.e., the 2nd element onward) I think they'll have to build up deferred cons operations. I'm not sure about this at all. I sound suspicious, I know, but before you say anything, look here: http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-11.html...

Don't they look suspiciously similar? In that example, you'd have needed to store a function pointer and the argument type(s) for each call, on the stack, making space usage O(n). Each call pushes to the stack.

Does the Haskell example seem similar to you? I'm really uncertain.

>Are you sure they make that optimization? How do you know?

Yeah, if you write a tail-call recursive fib function and turn on optimizations, the optimizer should definitely loop-ify it.

I tested this a while back on my machine. Writing C and compiling with LLVM yielded a 5-instruction loop, and writing tail-call recursive, strict (not WHNF-recursive) Haskell with optimizations on yielded a 4-instruction loop. Both were very fast. You can look at the generated assembly with a variety of tools. I used Apple's profiler tools.

>Each call to 'fibs' constructs a list with its first argument and the recursive call. But for all of the recursions (aka, 2nd element and on) I think they'll have to build up deferred cons operations.

Correct. If you saved the list, it would be O(n). However, since you throw away the head of the list and move on to the next element immediately (if you want the nth fib number), the garbage collector will reclaim the list elements you have already passed. The space consumption is then only the list element you want (or are looking at currently), so it's O(1).

It might look like this, in pseudo-code.

    def fibn(n){
        (curr, nextGenerator)=fibs(0, 1);
        for(i=0; i<n; i++){
            (curr, nextGenerator) = nextGenerator();
        }
        return curr;
    }
Since the rest of the list is passed as an unevaluated function, you only need to keep track of two things at once: the head of the list (in case it's the correct element) and the function to make the rest of the list.
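The pseudo-code above can be approximated in C. This is a sketch, assuming the "unevaluated function" is represented as a small struct holding the two values needed to continue (C has no closures); the names struct fib_gen and fib_step are hypothetical, and this is not how a Haskell runtime actually represents thunks.

```c
/* The head of the list plus the state needed to produce the rest of it. */
struct fib_gen {
    unsigned long long curr;   /* head of the list */
    unsigned long long next;   /* enough state to generate the tail */
};

/* Corresponds to fibs(a, b) in the pseudo-code. */
struct fib_gen fibs(unsigned long long a, unsigned long long b)
{
    struct fib_gen g = { a, b };
    return g;
}

/* Corresponds to calling nextGenerator(): drop the head, produce the next one. */
struct fib_gen fib_step(struct fib_gen g)
{
    return fibs(g.next, g.curr + g.next);
}

/* n-th Fibonacci number; only one fib_gen value is live at any time. */
unsigned long long fibn(unsigned n)
{
    struct fib_gen g = fibs(0, 1);
    for (unsigned i = 0; i < n; i++)
        g = fib_step(g);
    return g.curr;
}
```

Each iteration overwrites g, which plays the role of the garbage collector discarding list elements that have already been passed.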

Also, Haskell doesn't really use the stack like a lot of other languages do. That's why you can have infinitely recursive functions without running out of space. It accomplishes this using the technique I put in that loop up there: passing values around as unevaluated functions.

(Side note: Haskell doesn't keep track of types at runtime. The compiler erases type information during compilation, and tries to eliminate any dynamic function lookup. As long as the types match up during compile time, they will match up during runtime.)

> I don't see how it's possible to have recursion without a stack.

Recursion can happen without a stack as long as function calls can happen without a stack. It's well-known how to achieve this. See Tail Call Optimization, for example.

That's why I said 'deficient' compilers. Yes, it is well-known. Just reuse the stack frames on the next call; dead simple.
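Where the compiler does not reuse frames, the same effect can be had by hand with a trampoline, the technique mentioned in the original post. A minimal sketch in C, with hypothetical names (struct bounce, fac_step, fac_trampoline): each step returns a description of the next call instead of making it, and a driver loop makes the calls from a single frame.

```c
#include <stdbool.h>

/* One bounce: either a finished result or the arguments of the next call. */
struct bounce {
    bool done;
    unsigned long long accum;
    unsigned n;
};

/* One step of the accumulator-style factorial. It never calls itself;
   it only reports what the next call would be. */
struct bounce fac_step(unsigned long long accum, unsigned n)
{
    struct bounce b;
    b.done  = (n == 0);
    b.accum = b.done ? accum : accum * n;
    b.n     = b.done ? 0 : n - 1;
    return b;
}

/* The trampoline: repeats the step until it reports completion, so the
   call depth stays constant regardless of n. */
unsigned long long fac_trampoline(unsigned n)
{
    struct bounce b = fac_step(1, n);
    while (!b.done)
        b = fac_step(b.accum, b.n);
    return b.accum;
}
```

This only works directly for calls in tail position; a non-tail call such as the inner Ackermann call still needs somewhere to record what remains to be done.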