Mini-interpreter implementations do not count, for performance reasons. You could have called a VM bytecode interpreter loop a "loop alternative to tail call" here.
My first example of (f x) is about as fast as you can get.
The longer example is not great; the point is that it's a very language-agnostic tradeoff of speed vs. memory. Replace the 'if' conditionals with a case statement and it's much faster; you can speed it up further by using continue.
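As a rough illustration of the case-statement variant, here is a minimal C sketch. The even/odd pair and the names `step`, `STEP_EVEN`, `STEP_ODD` and `run_switch` are hypothetical stand-ins, not anything from the thread. Each would-be tail call just updates the arguments, picks the next case and hits `continue`. Whether this actually beats an if-chain depends on the compiler.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state tag: which "function" should run next. */
enum step { STEP_EVEN, STEP_ODD };

/* Mutually recursive is_even / is_odd flattened into one loop. */
bool run_switch(enum step s, unsigned n)
{
    for (;;) {
        switch (s) {
        case STEP_EVEN:
            if (n == 0) return true;
            n -= 1;            /* was: return is_odd(n - 1);  */
            s = STEP_ODD;
            continue;          /* jump straight back to the top of the loop */
        case STEP_ODD:
            if (n == 0) return false;
            n -= 1;            /* was: return is_even(n - 1); */
            s = STEP_EVEN;
            continue;
        }
        /* Only reached if s holds a value outside the enum. */
        assert(0 && "unknown step");
        return false;
    }
}
```

Calling `run_switch(STEP_EVEN, n)` answers "is n even?" with constant stack use, however large `n` is.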
You can speed it up even more by having a jump at the end of each statement, but that stops looking like a loop and is basically just the tail call optimization.
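For comparison, a sketch of the "jump at the end of each statement" version in C, using the same hypothetical even/odd pair (`even_goto` and the labels are made-up names). The state variable and the loop disappear, and what is left looks a lot like what a compiler doing tail call optimization would likely emit:

```c
#include <stdbool.h>

/* Each label plays the role of one function body;
   each tail call becomes a plain goto. */
bool even_goto(unsigned n)
{
even:
    if (n == 0) return true;
    n -= 1;
    goto odd;      /* was: return is_odd(n - 1);  */
odd:
    if (n == 0) return false;
    n -= 1;
    goto even;     /* was: return is_even(n - 1); */
}
```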
Anyway, the important part is that it has low memory utilization and is compiler independent.
PS: I'm in no way suggesting tail call optimization is not useful; just you can mechanically simulate it when not available. You can usually beat that mechanical approach if you need to speed things up further.
> just you can mechanically simulate it when not available.
It's too slow to ever be anywhere near practical. For this reason, almost no JVM language implementation really does anything like this, besides Kawa, Bigloo and the like, which are slower than some of the dumb interpreters like SISC.
I don't think I was clear enough: it simplifies to zero code (not even a jump), just the original input. (I thought about saying no-op, but even that's wasteful.)
>"it's slow"
Again, it's just a technique; think writing embedded code with a really dumb compiler. Anyway, saying it's slow is not really a counterargument: if you're limited to, say, 2,000 bytes of RAM, you will make lots of tradeoffs between memory efficiency and speed. Perhaps you have a select statement, perhaps you don't, but starting off with pure ASM is often a pain.
EDIT: This is all from the perspective of dealing with tools that don't do tail call optimization, not writing a compiler / VM etc.
Though the mechanical equivalent would be something like:
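A minimal sketch in C, assuming a hypothetical pair of mutually recursive functions is_even / is_odd (the names `parity_fn`, `FN_EVEN`, `FN_ODD` and `run_if` are made up for illustration). Every tail call becomes "set the new arguments, set which function runs next, go around the loop again":

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tag saying which function body to run next. */
enum parity_fn { FN_EVEN, FN_ODD };

bool run_if(enum parity_fn f, unsigned n)
{
    for (;;) {
        if (f == FN_EVEN) {
            if (n == 0) return true;
            n -= 1;          /* tail call is_odd(n - 1) becomes ...    */
            f = FN_ODD;      /* ... new argument plus a new function tag */
        } else if (f == FN_ODD) {
            if (n == 0) return false;
            n -= 1;          /* tail call is_even(n - 1) */
            f = FN_EVEN;
        } else {
            assert(0 && "unknown function tag");
            return false;
        }
    }
}
```

Stack use stays flat no matter how many "calls" happen, at the cost of re-testing the tag on every iteration.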
You would then add any function you would tail call optimize into that function. Granted, with the right structure (inside > outside >... > inside) it can show up more than once on the stack but the same thing can happen despite tail call optimization.