by afraca 2654 days ago
A prominent project where I have encountered this is GHC, the Haskell compiler. The route there goes: Haskell --> Core --> STG --> C--

(From C-- it then goes to one of: native machine code generation, C for feeding to gcc, or LLVM's intermediate representation.)

2 comments

For those who are not Haskell experts: Core is a small lambda calculus with case (pattern matching), let and coercions; see [1] for details. STG stands for Spineless Tagless G-machine, a refinement of the older G-machine. The G-machine (short for Graph-Reduction Machine) is an abstract machine for the lazy (call-by-need) evaluation of functional programs in supercombinator form [4]. Instead of interpreting supercombinators as rewrite rules, it compiles them into sequential code with special instructions for graph manipulation. See [2].
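To make "small lambda calculus with let" and "call-by-need" a bit more concrete, here is a rough sketch of a toy interpreter in Python. It is not GHC's Core and not the G-machine: it leaves out case and coercions, it interprets rather than compiles, and every name in it (Lit, Var, Lam, App, Let, Add, Thunk, evaluate) is made up for illustration. The only point is that a bound expression is suspended in a thunk and evaluated at most once, which is the sharing that graph reduction provides.

```python
from dataclasses import dataclass
from typing import Any


# A toy expression language (hypothetical, far smaller than GHC Core).
@dataclass
class Lit:
    value: int

@dataclass
class Var:
    name: str

@dataclass
class Lam:
    param: str
    body: Any

@dataclass
class App:
    fn: Any
    arg: Any

@dataclass
class Let:
    name: str
    bound: Any
    body: Any

@dataclass
class Add:  # a primitive operation, strict in both arguments
    left: Any
    right: Any


class Thunk:
    """A suspended expression that is evaluated at most once."""
    computations = 0  # counts how many thunks were actually evaluated

    def __init__(self, expr, env):
        self.expr, self.env = expr, env
        self.done = False
        self.value = None

    def force(self):
        if not self.done:
            Thunk.computations += 1
            self.value = evaluate(self.expr, self.env)
            self.expr = self.env = None  # overwrite the node with its result
            self.done = True
        return self.value


def evaluate(expr, env):
    if isinstance(expr, Lit):
        return expr.value
    if isinstance(expr, Var):
        return env[expr.name].force()
    if isinstance(expr, Lam):
        return lambda thunk: evaluate(expr.body, {**env, expr.param: thunk})
    if isinstance(expr, App):
        # The argument is not evaluated here, only suspended.
        return evaluate(expr.fn, env)(Thunk(expr.arg, env))
    if isinstance(expr, Let):
        new_env = dict(env)
        new_env[expr.name] = Thunk(expr.bound, new_env)
        return evaluate(expr.body, new_env)
    if isinstance(expr, Add):
        return evaluate(expr.left, env) + evaluate(expr.right, env)
    raise TypeError(f"unknown expression: {expr!r}")


# let x = 20 + 1 in (\y -> y + y) x
program = Let("x", Add(Lit(20), Lit(1)),
              App(Lam("y", Add(Var("y"), Var("y"))), Var("x")))
result = evaluate(program, {})
```

Although y is used twice in the body, the suspended 20 + 1 is computed once and its result is reused on the second use.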

For a general overview of GHC's architecture, see [3].

[1] A. Tolmach, T. Chevalier and the GHC Team, An External Representation for the GHC Core Language (For GHC 6.10). https://downloads.haskell.org/~ghc/6.12.2/docs/core.pdf

[2] S. Peyton Jones, Implementing lazy functional languages on stock hardware: the Spineless Tagless G-machine. https://www.microsoft.com/en-us/research/wp-content/uploads/...

[3] S. Marlow, S. Peyton Jones, The Glasgow Haskell Compiler. https://www.aosabook.org/en/ghc.html

[4] https://wiki.haskell.org/Super_combinator

Would converting C-- to C/LLVM IR be redundant?
It's true that they are mostly on the same level, but it's far easier to compile C-- to LLVM IR or C than to write three separate STG -> (imperative code) stages.
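A toy sketch of that argument, in Python. This is not how GHC is written: the nested-tuple input, the three-address instruction format and the names lower, emit_c and emit_llvm are all hypothetical. The hard part (flattening a nested expression into sequential instructions) is written once, and each backend is then just a thin printer over the flat form.

```python
from typing import List, Tuple

# A toy three-address instruction: (destination, operator, left, right).
Instr = Tuple[str, str, str, str]


def lower(expr, instrs=None):
    """Shared lowering step: nested expression -> flat instruction list.

    An expression is either an int or a tuple (op, left, right).
    Returns the instruction list and the name of the operand holding the result.
    """
    if instrs is None:
        instrs = []
    if isinstance(expr, int):
        return instrs, str(expr)
    op, left, right = expr
    _, l = lower(left, instrs)
    _, r = lower(right, instrs)
    dest = f"t{len(instrs)}"  # fresh temporary name
    instrs.append((dest, op, l, r))
    return instrs, dest


def emit_c(instrs: List[Instr]) -> List[str]:
    """Backend 1: print each instruction as a C statement."""
    return [f"int64_t {d} = {l} {op} {r};" for d, op, l, r in instrs]


def emit_llvm(instrs: List[Instr]) -> List[str]:
    """Backend 2: print each instruction as an LLVM-IR-style line."""
    names = {"+": "add", "*": "mul"}

    def operand(x: str) -> str:
        # Integer literals stay as-is, temporaries get a % prefix.
        return x if x.lstrip("-").isdigit() else "%" + x

    return [f"%{d} = {names[op]} i64 {operand(l)}, {operand(r)}"
            for d, op, l, r in instrs]


# 1 + (2 * 3), lowered once and printed by both backends
flat, _ = lower(("+", 1, ("*", 2, 3)))
c_lines = emit_c(flat)
llvm_lines = emit_llvm(flat)
```

Adding a third target means adding one more small printer, not a third copy of the lowering logic.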