by xoroshiro
3238 days ago
I don't understand why this still looks scary to me. A while back, after learning regex (though only at a basic level), I thought maybe I could write my own custom C preprocessor as an exercise. Perfect, right? I get to choose the syntax AND the rules. But somehow I just can't write it. Nested ifdefs, defines, and undefs all mixing together are too OP. Reading through this, it seems like a huge amount of work goes into real compilers: front end, optimizer, backend, etc. I do appreciate them more, but as useful as compilers are, maybe I'll just leave it to the pros (:
Don't ever do that! It really isn't that complicated at a basic level. Regexes aren't the right tool for parsing nested structure. What you're missing is the recursive nature of "context-free languages" [1]. Formal language theory [2] was one of the big, early wins of computer science. There's a lot to it, but writing a simple recursive descent parser doesn't require any of it.
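To make that concrete, here is a rough sketch of a recursive descent parser for arithmetic with nested parentheses. The grammar, the tuple-shaped AST, and every name in it (`tokenize`, `Parser`, `parse`) are made up for illustration, not taken from the article. Note that the regex is only used to split the text into tokens; the nesting is handled by `factor` calling back into `expr`.

```python
import re

# Regex is fine for *tokens*; it is the nesting that needs recursion.
TOKEN = re.compile(r"\s*(\d+|[-+*/()])")

def tokenize(text):
    """Split the input into number and operator tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

class Parser:
    """One method per grammar rule (hypothetical toy grammar)."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        # expr := term (('+' | '-') term)*
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.take()
            node = (op, node, self.term())
        return node

    def term(self):
        # term := factor (('*' | '/') factor)*
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.take()
            node = (op, node, self.factor())
        return node

    def factor(self):
        # factor := NUMBER | '(' expr ')'
        tok = self.take()
        if tok == "(":
            node = self.expr()  # the recursive step that handles nesting
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok is not None and tok.isdigit():
            return int(tok)
        raise SyntaxError(f"unexpected token {tok!r}")

def parse(text):
    """Parse a whole expression and reject trailing junk."""
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree
```

With this sketch, `parse("1 + 2 * (3 - 4)")` returns `('+', 1, ('*', 2, ('-', 3, 4)))`. Nested `#ifdef` blocks have the same shape as the nested parentheses here: the rule for a block calls itself when it sees another opener.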
The big caveat is that parsing is only about a third of the problem; the other two parts, as described in this article, are optimization and code generation. Of these, optimization can be skipped entirely, leaving only code generation, which can be done naively by walking the AST. If this all seems foreign to you, there's a lot of really good info online nowadays, since people love writing about this subject.
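For a feel of what "naive code generation by walking the AST" can look like, here is a minimal sketch. It assumes an AST made of nested tuples like `("+", left, right)` with integers at the leaves, and it targets an invented stack machine with `PUSH`, `ADD`, `SUB`, `MUL`, and `DIV` instructions. The instruction set, the `gen` and `run` names, and the choice of integer division are all assumptions for illustration.

```python
# Hypothetical mapping from AST operators to stack-machine opcodes.
OPS = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}

def gen(node, out=None):
    """Post-order walk: emit code for both operands, then the operator."""
    if out is None:
        out = []
    if isinstance(node, int):
        out.append(("PUSH", node))
    else:
        op, left, right = node
        gen(left, out)
        gen(right, out)
        out.append((OPS[op],))
    return out

def run(code):
    """Tiny interpreter for the toy instruction set, to check the output."""
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
            continue
        right = stack.pop()
        left = stack.pop()
        if instr[0] == "ADD":
            stack.append(left + right)
        elif instr[0] == "SUB":
            stack.append(left - right)
        elif instr[0] == "MUL":
            stack.append(left * right)
        elif instr[0] == "DIV":
            stack.append(left // right)  # integer division, by assumption
    return stack.pop()

ast = ("+", 1, ("*", 2, ("-", 3, 4)))  # 1 + 2 * (3 - 4)
code = gen(ast)
```

Here `code` comes out as `PUSH 1, PUSH 2, PUSH 3, PUSH 4, SUB, MUL, ADD`, and `run(code)` gives `-1`. There is no optimization pass at all; the walk just emits instructions in the order it visits nodes.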
Understanding this subject is very important in my opinion; it's one of the foundational things we do. Attaining a comfort level with compiler engineering is one of the two or three things anyone can do to really "level up" as a software developer. Some other things are writing multithreaded servers in C and 3D software rasterization. Again, my opinion here.
[1] https://stackoverflow.com/questions/559763/regular-vs-contex...
[2] https://en.wikipedia.org/wiki/Formal_language