Hacker News new | ask | show | jobs
by synthetigram 1169 days ago
What isn't clear to me is why tail calls need to be implemented in WASM, rather than in the compiler? The post linked to Josh Habermans post on tail calls, which show how tail calls can help the compiler decide where to inline (cool!). But that was needed for the C++ text, not the LLVM code. It feels like tail calls are too high level of a concept to be in an "assembly" language.
7 comments

If a function calls itself using tail-recursion, a compiler can turn that into a loop without too much trouble. However, if it's tail-calling a different function then that becomes more difficult; it would have to merge the two functions into a single WASM function. And if the tail-call is indirect (through a function pointer) then it is impossible to turn into a loop.

> It feels like tail calls are too high level of a concept to be in an "assembly" language.

WASM is a bit higher level than typical assembly languages. It doesn't have unrestricted "goto", so there's no way to implement tail calls optimization the "hard way".

> It feels like tail calls are too high level of a concept to be in an "assembly" language.

In x86, the difference is between `call some_fn` and `jmp some_fn` (perhaps after a `call _guard_check_icall` to implement Control Flow Guard)

In WASM, the difference is between `call[_indirect] some_fn` and `return_call[_indirect] some_fn`, which is just `jmp some_fn` with a funny name - and similar control flow integrity constraints as imposed by CFG. Why not just use `jmp some_fn`? Because WASM lacks a generic `jmp some_fn`, as they've baked https://webassembly.org/docs/security/#control-flow-integrit... into the arch, which seems fair enough.

To add to other reasons, my opinion is: Webassembly right now is an extremely easy target for compilation. You can create your own compiler absolutely easily and get plenty of V8 optimizations for free. It's like LLVM, but probably much more accessible for hobby projects. You just parse text, build AST and dump AST to WebAssembly.

If WebAssembly would implement tail calls, implementing them in an original language requires zero efforts, they would just work.

But implementing tail calls using some kind of optimization, like function rewrite to loop is far from easy.

So my personal hope is that Webassembly would include features that are possible but very hard to develop in a hobby project. GC is another example. Even primitive toy GC is a serious project.

Reference counting is a toy GC algorithm, quite simple to implement.

Making it perform as fast as tracing GC algorithms, now that is a serious project.

WASM has structured control flow, with stack frame maintained by runtime. You cannot just jump anywhere.
Because the general tail call problem isn't self recursion - that is indeed trivial to flatten, but co-recursion, or recursion when you don't know the target.

co-recursion (f calls g calls h calls f, etc) requires you create a single function containing the bodies of all functions that will be called via tail recursion, put them all in a loop, and have a single stack frame capable of holding all the independent frames concurrently. It's doable, but taking my dumb example you'd need to do this for each of f, g, and h, or you have to convert those functions into wrapper functions for your one giant mega function. The mega function then has issues if it has non-tail recursive calls that re-enter as its own frame is large. The stack frame is large because WASM is structured: the bytecode can't just treat the stack as a general purpose blob of memory.

Dynamic recursion is a case where ahead of time you quite simply cannot have compile time flattening, e.g.

    int f(int (*g)()) {
      return g();
    }
Languages that directly target machine code can do this because they just have to rebuild the stack frame at the point of the call and perform a jump rather than a call/bl instruction. WASM bytecode can't do that, as it needs to be safe: the stack is structured and unrestricted jumps aren't an option.

Now it's not uncommon for the various JS/WASM engines to perform tail call optimizations anyway, but the important things that many languages need is guaranteed tail calls, e.g. a way to say "hello runtime, please put the work into making this a tail call even if you haven't decided the function is hot enough to warrant optimizations, as I require TCO for correctness".

For example lets imagine this silly factorial function:

    function factorial(n, accumulator = 1) {
      if (n <= 1) return accumulator;
      return factorial(n - 1, accumulator * n);
    }
in the absence of guaranteed TCO, this might fail for a large enough argument n. If the function is called enough a JIT might choose to run some optimization passes that perform TCO and then this works for any value of n. So for correctness it is necessary to be able to guarantee TCO will occur. In wasm that's apparently [going to be?] a annotated call instruction, which is what .net's vm does as well. The JS TCO proposal a few years back simply said something like "if a return's expression is a function call, the call must be tail recursive".
WASM is not really an assembly language. Before this, WASM didn't have jump at all, and so tail calls are adding a form of jump (jump with arguments). This makes WASM a much better compilation target.
It doesn't and I hope whoever is in charge of web assembly doesn't ruin what they have by giving in to nonsense trends that are part of a silver bullet syndrome and don't offer real utility.
> giving in to nonsense trends

Tail-calls have been pretty standard since at least the 'Lambda the Ultimate GOTO' paper from 1977 https://en.wikisource.org/wiki/Lambda_Papers

For example, it's done by all of the Schemes, all of the MLs (SML, Ocaml, Haskell, Rust, Scala, Idris, etc.), scripting languages (Lua, Elm, Perl, Tcl, etc.), and many others.

> that are part of a silver bullet syndrome and don't offer real utility

Their "real utility" is right there in the subtitle of that original paper:

  Debunking the "Expensive Procedure Call" Myth
  or, Procedure Call Implementations Considered Harmful
  or, Lambda: The Ultimate GOTO
In other words, any implementation of procedure/function/method-calls which doesn't eliminate tail-calls is defective (slow, memory-hungry, stack-unsafe, etc.)
Nitpick: Rust doesn't do tail call optimization (unless llvm decides to unprompted, iirc).

https://github.com/rust-lang/rfcs/issues/2691

Tail-calls have been pretty standard

Not standard, since almost no programming is actually done using tail calls. Programming is done with loops, tail calls are an exotic and niche way of working.

Even in lua and rust tail calls are rarely used and the other languages you listed are extremely niche.

Debunking the "Expensive Procedure Call" Myth or, Procedure Call Implementations Considered Harmful or, Lambda: The Ultimate GOTO

These are not explanations of utility, these are titles that you are using to claim something without backing it up.

In other words, any implementation of procedure/function/method-calls which doesn't eliminate tail-calls is defective (slow, memory-hungry, stack-unsafe, etc.)

This is a very bold claim with no evidence for it and pretty much the entire history of programming against it.

> These are not explanations of utility, these are titles that you are using to claim something without backing it up.

That wasn't 'me claiming something', it was Guy L Steele Jr., who's worked on such "niche" languages as Java, Fortran and ECMAScript: https://en.wikipedia.org/wiki/Guy_L._Steele_Jr.

Also, it was literally the title of a paper. If citing the literature with a hyperlink to wikisource doesn't count as "backing it up", then I have no idea where you put the goalposts.

> almost no programming is actually done using tail calls

Literally every function/procedure/method/subroutine/etc. has at least one tail position (branching allows more than one). It's pretty bold to claim that there are 'almost no' function calls in those positions. I wouldn't believe this claim without seeing some sort of statistics.

> Programming is done with loops, tail calls are an exotic and niche way of working.

Loops have limited expressiveness; e.g. they don't compose, they break encapsulation, etc. Hence most (all?) programs utilise some form of function/method/procedure/subroutine/GOTO. Tail-calls are simply a sub-set of the latter which, it turns out, are more powerful and expressive than loops (as an obvious example: machine-code doesn't have loops, since it's enough to have GOTO (AKA tail-calls)).

That wasn't 'me claiming something', it was Guy L Steele Jr.

You still just copy and pasted titles, this isn't evidence of anything.

If citing the literature

Then put in the part you think is evidence or significant. This is the classic "prove my point for me" routine. You're the one who wants to change a standard.

Literally every function/procedure/method/subroutine/etc. has at least one tail position

Are you conflating general functions with tail call elimination?

Loops have limited expressiveness; e.g. they don't compose, they break encapsulation,

Why would that be true? How would looping through recursion change this?

Hence most (all?) programs utilise some form of function/method/procedure/subroutine/GOTO

What does this have to do with tail call optimizations? Web assembly has functions.

machine-code doesn't have loops,

Web assembly is not the same as machine code

I think overall you are thinking that making claims is the same as evidence. You haven't explained any core idea why tail call optimizations have any benefit in programming or web assembly. You basically just said a well known language creator put them in some languages. There is no explanation of what problem is being solved.

> Web assembly is not the same as machine code

Indeed; if web assembly allowed unrestrained GOTOs (like machine code) then compilers would already be able to do tail-call elimination.

---

> Are you conflating general functions with tail call elimination?

> What does this have to do with tail call optimizations?

> You haven't explained any core idea why tail call optimizations have any benefit

Sorry, I think there have been some crossed wires: I was mostly pointing out the absurdity of your statement that "almost no programming is actually done using tail calls" (when in fact, almost all programs will contain many tail-calls).

That's separate to the question of how tail-calls should be implemented: in particular, whether they should perform a GOTO (AKA tail-call elimination/optimisation); or, alternatively, whether they should allocate a stack frame, store a return address, then perform a GOTO, then later perform another GOTO to jump back, then deallocate that stack frame, etc.

> You haven't explained any core idea why tail call optimizations have any benefit in programming or web assembly

Based on my previous sentence, I would turn the question around: what benefit is gained from allocating unnecessary stack frames (which waste memory and cause stack overflows), performing redundant jumps (slowing down programs), etc.?

nonsense trends like an "assembly" language supporting jumps?
I will quote you from a different comment:

WASM is not really an assembly language.

One of the main reasons it's not really an assembly language is that it doesn't support jumps! This is fixing a defect!
This is fixing a defect!

Only according to people using niche languages that are already slower than javascript but not anyone designing, implementing and using webasm.

This would allow you to do things like compile other assembly languages to webassembly.