Hacker News new | ask | show | jobs
by milani 1890 days ago
Can a Decent Compiler convert direct style to CPS under the hood so to have the best of both worlds? Let's say we return Result<Integer> for the lookup and then pattern match for error or result. A compiler would be able to create two callbacks based on the match arms and change the lookup interface to accept them.
4 comments

Yes. Check out the book "Compiling with Continuations" by Andrew Appel.

Not only can a compiler "convert direct style to CPS under the hood," it's one of the more efficient and effective ways to implement a compiler.

At a high level, CPS provides an intermediate form that's well suited to transformation into machine-oriented forms like machine code and bytecode, but is still high-level enough to be able to support both human and automated reasoning.

Also see the paper, "A Correspondence between Continuation Passing Style and Static Single Assignment Form"[1] which describes the equivalence between SSA, commonly used in imperative language compilers, and CPS.

This is one of the many cases where a well-motivated functional solution turns out to be one of the better solutions to a problem, to the point that a version of it is used even in imperative contexts.

[1] https://www.cs.purdue.edu/homes/suresh/502-Fall2008/papers/k...

In addition to those, Matt Might has an article that breaks down the steps as well: http://matt.might.net/articles/cps-conversion/
Cool, I'll check them out. If it is "one of the more efficient and effective ways to implement a compiler", are we aware of any non-functional languages that use this style in their compilers? Asking since most compilers use LLVM and LLVM is in SSA form.
The insight in the Kelsey paper I linked is that SSA and CPS are essentially equivalent. SSA is more or less CPS expressed at a kind of lower level, more like a bytecode. So one answer to the question is that all SSA compilers are already using a form of CPS.

If you wanted to use traditional CPS explicitly, you'd need a transformation step that converts the imperative source code into a more functional intermediate form so that it can be CPS'd. There might be benefits to that, since CPS can be easier to analyze than SSA. But I'm not aware of any real imperative compilers that do it.

There's been academic work on it, like: https://arxiv.org/abs/1202.3247

One of the potential benefits is covered in that paper: the ability to prove correctness of the transformations. That kind of thing is easier to do given the mathematical tractability of CPS, due to its connection to lambda calculus.

CPS is a tool in the box of language theorists and compiler authors (although I don't know if any non-prototype actually uses CPS). IIRC it is mostly a convenient alternative to SSA. So yes, a compiler can transform your code into CPS to make some other passes simpler.

Just an example from my own research: Say your program has the form of a long prelude followed by a tight loop. The loop cannot easily be changed, but the prelude can (think of languages like prolog, the prelude sets up the model, the loop solves it). At some point during the prelude you want to say: "For now, do this, but if x happens inside the loop, come back here, do that and then go back into the loop!" (e.g., your solver found a solution and you want to add/remove/change predicates). This is conveniently expressed as taking a continuation and passing it into the loop, which continues it when necessary.

Actually, programs in SSA form are a subset of programs in CPS [0]

[0] "A correspondence between continuation passing style and static single assignment form" https://www.cs.purdue.edu/homes/suresh/502-Fall2008/papers/k...

SML/NJ uses CPS.
And it's noteably slow due to it; c.p.s. made sense at that time, but modern hardware takes advantage of the existence of a stack much more aggressively.

Chicken Scheme also uses it with a novel approach that allows the C stack to become the garbage collector which is still relatively fast as far as I heard.

Tragically, this blog post is not using the words "continuation passing style" to refer to CPS. If you read it you will see that the author is talking about a "style" of code which involves passing multiple lambdas to a function.

Why he thinks it is a good idea to refer to this idiom as "continuation passing style" is anybody's guess. It is not CPS. His usage is confusing and to me it seems most likely that he just wants to be seen using a fancy ten-dollar technical term with the ring of prestige. Basically, the kind of blog post that wastes the time of unsuspecting readers.

You are mistaken. This is exactly continuation passing style. The two arguments `if-found` and `if-not-found` are continuations of `lookup`. Note that their invocations are in tail position within lookup.
Yes, the Kotlin compiler does this. Suspending functions are based in CPS.

https://resources.jetbrains.com/storage/products/kotlinconf2...

Yes, as others said.

What's more: you could also interpret the typical stack based execution in machine code as a special CPS style execution: only the order is strictly LIFO so that the continuations can be allocated on the stack. I.e., the continuation object for a function call is simply the function call's stack frame. Everything is there: 'contuation address' is in there: the return address, i.e., the 'ret' instruction, which is just a computed branch in the end, implements the invocation of the continuation, and it passes it the return value as a parameter. Sure, the compiler does need to be careful about the stack pointer manipulation, but the analogy is close enough. This can very easily be changed to a heap-based allocation of 'stack frames'. The compiler can also optimise and do some continuation allocations on the stack when it knows it will run out of lifetime after the call, or on the heap, just by the right setup of the frame pointer. And the compiler may even do tail call optimisations from this, so even with CPS, you get O(1) memory usage if the code is recognised to allow that.

Even more, if the heap is compressed, and a heap allocation is as easy as incrementing a pointer, then the stack pointer could be reinterpreted to point to the end of the continuation heap (careful with deallocations in this case, of course), so just a heap and no stack, the runtime system would not even need more pre-defined registers than a normal stack based execution and instructions that implicitly increment the stack pointer could be used to fill the next continuation object -- it would look exactly like constructing a stack frame.

It's all connected.