by JonChesterfield 889 days ago
What're you thinking of doing with the preprocessor? Accept the complexity and build that too, run a pre-existing one, implement a subset of it, other...

CPP needs to run after lexing, and integer constant expressions need to be parsed and interpreted for #if. So I'm trying to implement my own since I'm already doing lexing/parsing/interpreting. Implementing everything end-to-end also seems like the only way to output decent error messages.
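To make the `#if` part concrete, here is a minimal sketch of a recursive-descent evaluator, assuming the directive's text has already been macro-expanded. Every name in it (`Cursor`, `eat`, `eval_if`) is hypothetical, and it covers only a subset: no `defined`, shifts, bitwise operators, `?:`, unsigned arithmetic, or error reporting.

```c
#include <ctype.h>
#include <inttypes.h>
#include <string.h>

/* Hypothetical names throughout; handles only a subset of #if expressions. */
typedef struct { const char *p; } Cursor;

static intmax_t parse_or(Cursor *c);

/* Skip blanks, then consume `op` if it comes next. */
static int eat(Cursor *c, const char *op) {
    while (*c->p == ' ' || *c->p == '\t') c->p++;
    size_t n = strlen(op);
    if (strncmp(c->p, op, n) != 0) return 0;
    c->p += n;
    return 1;
}

static intmax_t parse_primary(Cursor *c) {
    if (eat(c, "(")) {
        intmax_t v = parse_or(c);
        eat(c, ")");                      /* no error reporting in this sketch */
        return v;
    }
    if (isdigit((unsigned char)*c->p)) {
        char *end;
        intmax_t v = strtoimax(c->p, &end, 0);
        while (*end && strchr("uUlL", *end)) end++;   /* drop integer suffixes */
        c->p = end;
        return v;
    }
    /* Identifiers left over after macro expansion evaluate to 0. */
    while (isalnum((unsigned char)*c->p) || *c->p == '_') c->p++;
    return 0;
}

static intmax_t parse_unary(Cursor *c) {
    if (eat(c, "!")) return !parse_unary(c);
    if (eat(c, "-")) return -parse_unary(c);
    if (eat(c, "+")) return parse_unary(c);
    return parse_primary(c);
}

static intmax_t parse_mul(Cursor *c) {
    intmax_t v = parse_unary(c);
    for (;;) {
        if (eat(c, "*")) v *= parse_unary(c);
        else if (eat(c, "/")) {
            intmax_t r = parse_unary(c);
            v = r ? v / r : 0;            /* a real implementation would diagnose this */
        } else return v;
    }
}

static intmax_t parse_add(Cursor *c) {
    intmax_t v = parse_mul(c);
    for (;;) {
        if (eat(c, "+")) v += parse_mul(c);
        else if (eat(c, "-")) v -= parse_mul(c);
        else return v;
    }
}

static intmax_t parse_rel(Cursor *c) {
    intmax_t v = parse_add(c);
    for (;;) {                            /* two-character operators first */
        if (eat(c, "<=")) v = v <= parse_add(c);
        else if (eat(c, ">=")) v = v >= parse_add(c);
        else if (eat(c, "<")) v = v < parse_add(c);
        else if (eat(c, ">")) v = v > parse_add(c);
        else return v;
    }
}

static intmax_t parse_eq(Cursor *c) {
    intmax_t v = parse_rel(c);
    for (;;) {
        if (eat(c, "==")) v = v == parse_rel(c);
        else if (eat(c, "!=")) v = v != parse_rel(c);
        else return v;
    }
}

static intmax_t parse_and(Cursor *c) {
    intmax_t v = parse_eq(c);
    while (eat(c, "&&")) { intmax_t r = parse_eq(c); v = v && r; }
    return v;
}

static intmax_t parse_or(Cursor *c) {
    intmax_t v = parse_and(c);
    while (eat(c, "||")) { intmax_t r = parse_and(c); v = v || r; }
    return v;
}

/* Evaluate the (already macro-expanded) text of an #if directive. */
intmax_t eval_if(const char *text) {
    Cursor c = { text };
    return parse_or(&c);
}
```

For example, `eval_if("!FOO && 10 / 2 >= 5")` yields 1, because the leftover identifier `FOO` counts as 0. A real implementation would work on preprocessing tokens rather than raw characters, which is where sharing the compiler's lexer pays off.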
IIRC the C preprocessor is not very hard to implement according to the specification, as long as you don't worry too much about performance.
The C preprocessor is hilariously underspecified in the standard, so implementing the standard doesn't guarantee that you'll be able to handle real-world C programs (even ones that don't use GNU or clang extensions).
The K&R preprocessor was indeed underspecified and allowed lots of variation (many of those issues can be seen in the GCC manual [1]), but current ISO C does a much better job AFAIK. I think `## __VA_ARGS__` is the only popular preprocessor extension [2] at this moment, as the standard replacement (`__VA_OPT__`) is still very new.

[1] https://gcc.gnu.org/onlinedocs/gcc/Incompatibilities.html

[2] Assuming that we don't count things like `#pragma` or `#include_next`, which can be added without affecting other preprocessing jobs.
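For anyone who hasn't seen the two forms side by side, here is a small sketch. The macro names `FMT_GNU` and `FMT_STD` are made up for illustration, and it assumes a compiler that accepts both spellings (recent GCC and clang do in their default modes).

```c
#include <stdio.h>

/* Hypothetical macros; `buf` must be a char array so that sizeof works. */

/* GNU extension: `##` deletes the preceding comma when no variadic
   arguments are passed. */
#define FMT_GNU(buf, fmt, ...) \
    snprintf((buf), sizeof (buf), (fmt), ##__VA_ARGS__)

/* Standard form (C23 / C++20): __VA_OPT__(,) emits the comma only when
   variadic arguments are present. */
#define FMT_STD(buf, fmt, ...) \
    snprintf((buf), sizeof (buf), (fmt) __VA_OPT__(,) __VA_ARGS__)
```

Both `FMT_GNU(a, "x")` and `FMT_STD(a, "x")` expand to a call with no trailing comma, and both pass extra arguments through unchanged, so a preprocessor that wants to handle real-world code has to implement at least the first form.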

Yes, consider the case of shecc. It needs only a handful of lines of C to interpret preprocessor directives. Rather than relying on existing tools like cpp, as, or ld, shecc stands alone as a minimalist cross-compiler. This design could be particularly beneficial for students studying compiler construction. See https://github.com/sysprog21/shecc/blob/master/src/lexer.c#L...
I largely meant a standard-compliant implementation though, which shecc doesn't claim to be. ;-) In comparison, I can easily see that this lexer is not suitable for a preprocessor, because C requires a superset of numeric tokens (pp-numbers) [1] during the preprocessing phase.

[1] https://en.cppreference.com/w/c/language/translation_phases#...
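To illustrate the superset, here is a rough sketch of a pp-number scanner following the C11 grammar. The function name `scan_pp_number` is hypothetical, and the sketch ignores universal character names and the C23 digit separator.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns the length of the pp-number at the start
   of s, or 0 if s does not begin with one. */
size_t scan_pp_number(const char *s) {
    size_t i;
    if (isdigit((unsigned char)s[0]))
        i = 1;                                   /* digit */
    else if (s[0] == '.' && isdigit((unsigned char)s[1]))
        i = 2;                                   /* . digit */
    else
        return 0;
    for (;;) {
        unsigned char ch = (unsigned char)s[i];
        if ((ch == 'e' || ch == 'E' || ch == 'p' || ch == 'P') &&
            (s[i + 1] == '+' || s[i + 1] == '-'))
            i += 2;                              /* exponent letter plus sign */
        else if (isalnum(ch) || ch == '_' || ch == '.')
            i += 1;                              /* digit, identifier char, or '.' */
        else
            return i;
    }
}
```

Note that `0xe+1` scans as a single pp-number of length 5 and `1.2.3e+x4` as one of length 9, even though neither is a valid numeric constant. A lexer that only recognizes well-formed integer and floating constants splits them differently, which changes how tokens behave under `#` and `##`.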