Both, the obvious example is the fibinatchi sequence f(x) = f(x-1) + f(x-2). There are lots of efficient ways of coding it but the most obvious O(n^2) approch is n depth and it can't be tail-call optimized.
Granted, you can write much more efficient code that can be tail call optimized, but there complex and less obvious. There are also languages which cache previous results so the native approch becomes O(n) with fairly elegant code that's still depth n.
PS: Anyway, tail call optimization only works when you can rewrite the code as a loop for more complex structures it's less useful.
You would then add any function you would tail call optimize into that function. Granted, with the right structure (inside > outside >... > inside) it can show up more than once on the stack but the same thing can happen despite tail call optimization.
Mini-interpreter implementations do not count, for performance reasons. You could have called a VM bytecode interpreter loop a "loop alternative to tail call" here.
Granted, you can write much more efficient code that can be tail call optimized, but there complex and less obvious. There are also languages which cache previous results so the native approch becomes O(n) with fairly elegant code that's still depth n.
PS: Anyway, tail call optimization only works when you can rewrite the code as a loop for more complex structures it's less useful.