Hacker News new | ask | show | jobs
by dahart 2954 days ago
Yes that, or code the recursive solution in a heap-allocated space rather than on the stack. It somehow seems easier, safer, and more explicit and controllable to me to adjust the code & data to use manual recursion with backtracking than try to adjust system stack limits. Often it consumes a lot less memory too, since you have more control over what your "stack frame" looks like; you don't have to store all your local variables for every step.
2 comments

I've tried to compare recursive tree DFS and iterative one (with stack in Python list) implemented in Python - recursive was slightly faster. I've not compared memory usage thought. Function call has overhead and theoretically recursive approach should be slower, but list manipulation (append, pop) in Python is also not fast. In compiled languages results can differ (iterative probably will be faster).
Yeah, that’s true. I would expect manual recursion in Python using append/pop on a list based stack to be slower than native recursion. You might try pre-allocating your entire stack in a list or numpy array, and not using append & pop during the recursion at all. I might expect that to be faster than native recursion.
Any examples of code for this approach? From what I guess, you are implementing some kind of assembly like approach with explicit saving of stack frame, but I am having a hard time imagining it as being easier.
Not OP, but they probably mean something like the following toy problem (in C++):

  struct BinTree {
    long value;
    BinTree *left;
    BinTree *right;
  };

  long long RecursiveDFSSum(BinTree *node) {
    if (NULL == node) {
      return 0;
    }
    return (long long)value + RecursiveDFSSum(node->left) + RecursiveDFSSum(node->right);
  }

  long long IterativeDFSSum(BinTree *tree) {
    std::vector<BinTree *> custom_stack;
    custom_stack.push_back(tree);
    long long value = 0;
    while (!custom_stack.empty()) {
      BinTree *node = *custom_stack.rbegin();
      custom_stack.pop_back();
      if (NULL != node) {
        value += node->value;
        custom_stack.push_back(node->left);
        custom_stack.push_back(node->right);
      }
    }
    return value;
  }
Did not check this actually compiles etc. but you get the point. Both ways will give you the same solution in the same way, time complexity, space complexity etc. but the second one is not bound by max call stack size (only by max heap size). Additionally, the second one is slightly more space efficient, since the recursive solution requires saving an entire call frame into the stack (e.g. stack pointer, return address) whereas the iterative solution just stores one pointer per stack item.
This is exactly right; thank you for providing the example. This is manual recursion in a heap allocated space.

If you were coding a DP problem, then custom_stack might have a fixed size you can pre-allocate, and it might also be 2 or 3-dimensional.

For some image-based recursion, your backtracking doesn’t even need to store real pointers in the stack frame. For example when I’ve written a flood fill, I can store the return pointer as a one pixel offset in as little as 2 or 3 bits, depending on whether I include diagonal pixels (4-surround vs 8-surround).

Thanks! I was getting confused between stack data structure, function stack and stack space and had gotten to a weird mix in my head. The example cleared it up.
I think the OP might be referring to allocating and managing your own stack. You could use a list in Python for this.

Python memory management is automatic though, discussing stack vs heap doesn't make sense in the context of Python,well for CPython at least. I'm not sure about other Python implementations.

> Python memory management is automatic though, discussing stack vs heap doesn't make sense in the context of Python,well for CPython at least. I'm not sure about other Python implementations.

I'm not sure I know what you mean about stack vs heap not making sense because of Python's memory manager. Will you elaborate?

The primary issue is that the default stack limit in Python is too small for some applications of recursion, which the comment before mine illustrates. This is true not just in Python, but any language, since the stack size is generally a function of the process or OS, not a limit of the language. Heap allocated stacks are a reasonable thing to do in any language.

You're right you can solve that in Python by using a list. Python's memory management doesn't really affect one's ability to do so, right?

A secondary issue is that native recursion sometimes uses more memory than a manually heap-allocated "stack". If I make my own stack, I have complete control and complete understand of what's in memory and how much I use. With the native stack, it can be very opaque, and it's easy to chew up the already-too-small stack very quickly by accidentally having a large stack frame.

>"I'm not sure I know what you mean about stack vs heap not making sense because of Python's memory manager. Will you elaborate?"

In Python everything is an object. Python gives you a reference to that object when you create it. There is no way to tell Python(CPython anyway) in which memory space you would it to create that object.

> There is no way to tell Python(CPython anyway) in which memory space you would it to create that object.

Ah right, that's because all objects are heap-allocated.

You choose heap by using an object for the stack, and rewriting your recursion to use (superficially) iterative code.

You can choose stack allocation instead by using regular recursion: native function calls with local variables.

What you bring up is an interesting issue that can make recursion harder to understand in Python. Having local objects in the stack frame can cause both stack and heap allocation - pointers for the objects on the stack, and the object contents on the heap. Or, you might have global objects that aren't local to the recursive function call or the stack, in which case it's important to understand you're sharing data across function calls.

Generally speaking, you probably don't want individual heap allocations in a recursive function, so it's best not to have local objects. At least performance-wise.

>"You choose heap by using an object for the stack, and rewriting recursion using (superficially) iterative code.

You can choose stack allocation instead by using regular recursion: native function calls with local variables."

I'm not following you.

I believe these would both result in the same thing in Python. Python gives you a reference to an object. That object is stored in a private heap "somewhere." That reference to the object that Python gave you is stored on the stack. The object that it points to lives in the heap. This should be the same for both of your examples. I'm not sure what you mean by "native function calls." I am not familiar this this term.