by andybest 2991 days ago
Will be interesting to see how compiling to C affects debugging in the real world. LLVM is nice as a target, since it allows you to map generated IR blocks to source locations. Of course, it also means that the compiler needs to link to all of the LLVM libs, which is a bit of a hassle too. (I've found this when working on a Clojure->LLVM compiler bootstrapped from Clojure).

You can map locations in generated C code to locations in some other file, too. Standard C has a preprocessor directive for it:

    #line <num> "filename"
Or, if the filename is the same as in the previous directive, you can omit it:

    #line <num>
This directive tells the compiler which line number and file name to use when displaying error messages and such. I think it's meant to be used by the C preprocessor, but the preprocessor isn't required to use it, and other programs aren't prohibited from using it.
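To make the effect of #line concrete, here is a minimal sketch. The file name hello.src, the line number 100, and both function names are made up for illustration. After the directive, the compiler counts the next line as line 100 of hello.src, and __LINE__ and __FILE__ reflect that.

```c
#include <stdio.h>

/* Everything after this directive is reported as coming from hello.src,
   starting at line 100 (hypothetical name and number). */
#line 100 "hello.src"
static int reported_line(void) { return __LINE__; }          /* counted as line 100 */
static const char *reported_file(void) { return __FILE__; }  /* counted as line 101 */

/* Prints the location the compiler believes the code above came from. */
static void print_reported_location(FILE *out)
{
    fprintf(out, "%s:%d\n", reported_file(), reported_line());
}
```

A compile error or warning in those functions would be reported against hello.src too, which is what lets diagnostics point at the original source instead of the generated C.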

You might wind up generating a lot of these, but they're really easy to generate as long as you preserve that information all the way through compilation (which I'd assume you'd have to do for LLVM IR regardless).
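As a rough sketch of what generating these could look like, assuming the code generator keeps a source file and line for every statement it emits: struct gen_stmt and emit_with_line_directives are hypothetical names, not part of any real compiler, and the sketch does not escape quotes or backslashes in file names.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record: one generated C statement plus where it came from. */
struct gen_stmt {
    const char *src_file;  /* original source file */
    int src_line;          /* original source line */
    const char *c_text;    /* generated C code for that line */
};

/* Writes each statement preceded by a #line directive. The file name is
   left out when it matches the previous directive's file.
   Returns the number of characters written, or -1 if out is too small. */
static int emit_with_line_directives(char *out, size_t cap,
                                     const struct gen_stmt *stmts, size_t n)
{
    const char *prev_file = NULL;
    size_t used = 0;

    if (cap > 0)
        out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        int w;
        if (prev_file != NULL && strcmp(prev_file, stmts[i].src_file) == 0)
            w = snprintf(out + used, cap - used, "#line %d\n%s\n",
                         stmts[i].src_line, stmts[i].c_text);
        else
            w = snprintf(out + used, cap - used, "#line %d \"%s\"\n%s\n",
                         stmts[i].src_line, stmts[i].src_file, stmts[i].c_text);
        if (w < 0 || (size_t)w >= cap - used)
            return -1;  /* output was truncated or formatting failed */
        used += (size_t)w;
        prev_file = stmts[i].src_file;
    }
    return (int)used;
}
```

Feeding it two statements from main.clj (lines 3 and 4) and one from util.clj (line 10) would emit a full directive for the first and third statements and a bare #line 4 for the second.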

Interesting, I wasn't aware of the #line directive, thanks!

Yes, you would need to keep track of the source locations for each node in the AST. There is a nice example in the LLVM Kaleidoscope tutorial here: https://llvm.org/docs/tutorial/LangImpl09.html

Bison and lex use #line to make the generated C/C++ code (lexer, parser) debuggable. At least their GNU implementations do; I don't know about other implementations.