Hacker News
by azhenley 2224 days ago
I'm currently working on a 3-part blog series on writing a one-pass compiler in Python that emits C. It is quite a bit longer, though, because we make the lexer and parser from scratch. I started it because my students were struggling with other tutorials, given all the jargon, theory, and libraries they require.

http://web.eecs.utk.edu/~azh/blog/teenytinycompiler1.html

3 comments

Good text, but let’s nitpick. I’m too lazy/confident/arrogant to verify, so I may well embarrass myself here.

    def getToken(self):
        self.skipWhitespace()
        self.skipComment()
How does that handle multiple consecutive comments? Whitespace following a comment?

(returning ‘comment’ and ‘whitespace’ tokens would fix this, and would make it possible to reuse the lexer for pretty-printing/syntax coloring)
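A rough sketch of what that suggestion could look like, not taken from the blog post: the token names, the patterns, and the `tokenize_all` helper are all made up for illustration. Every character of the input ends up in some token, so joining the tokens reproduces the source exactly, which is what a pretty-printer or syntax highlighter needs. A parser would just filter out the trivia first.

```python
import re

# Hypothetical token set for illustration; the tutorial's real tokens differ.
TOKEN_SPEC = [
    ("COMMENT", r"#[^\n]*"),      # '#' up to, but not including, the newline
    ("WHITESPACE", r"[^\S\n]+"),  # any whitespace except newline
    ("NEWLINE", r"\n"),
    ("WORD", r"[^\s#]+"),         # stand-in for all "real" tokens
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize_all(source):
    """Return (kind, text) pairs covering every character of source."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:  # should not happen: the patterns cover all input
            raise ValueError(f"cannot tokenize at offset {pos}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


source = "# one\n  # two\nx = 1  # three\n"
tokens = tokenize_all(source)

# Lossless: a highlighter can rebuild the exact input from the tokens.
round_trip = "".join(text for _, text in tokens)

# A parser would drop the trivia before looking at the stream.
significant = [t for t in tokens if t[0] not in ("COMMENT", "WHITESPACE")]
```

The trade-off is that the parser (or a thin wrapper around the lexer) now has to skip trivia tokens itself instead of never seeing them.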

It works because a comment by definition goes until a newline, so you can’t have consecutive comments without a newline token in between.
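To make the newline argument concrete, here is a stripped-down sketch. It assumes that `skipWhitespace` skips spaces, tabs, and carriage returns but not newlines, that comments start with `#`, and that a newline is returned as its own token. Everything other than the three method names quoted above (the `WORD` token, `peek`, `tokenize`) is hypothetical and only there to make the example run.

```python
class Lexer:
    """Minimal illustration; only the skipping logic mirrors the thread."""

    def __init__(self, source):
        self.source = source + "\n"  # guarantee a terminating newline
        self.pos = 0

    def peek(self):
        return self.source[self.pos] if self.pos < len(self.source) else "\0"

    def skipWhitespace(self):
        # Newline is deliberately NOT skipped here: it is a token.
        while self.peek() in (" ", "\t", "\r"):
            self.pos += 1

    def skipComment(self):
        # A comment runs from '#' up to, but not including, the newline.
        if self.peek() == "#":
            while self.peek() not in ("\n", "\0"):
                self.pos += 1

    def getToken(self):
        self.skipWhitespace()
        self.skipComment()
        ch = self.peek()
        if ch == "\0":
            return ("EOF", "")
        if ch == "\n":
            self.pos += 1
            return ("NEWLINE", "\n")
        # Assumption: anything else is lumped into a generic WORD token.
        start = self.pos
        while self.peek() not in (" ", "\t", "\r", "\n", "\0", "#"):
            self.pos += 1
        return ("WORD", self.source[start:self.pos])


def tokenize(source):
    lexer = Lexer(source)
    tokens = []
    while True:
        token = lexer.getToken()
        tokens.append(token)
        if token[0] == "EOF":
            return tokens


# Two consecutive comment lines, the second one indented, then code with a
# trailing comment.
kinds = [kind for kind, _ in tokenize("# one\n  # two\nx  # three")]
```

Each comment stops at its newline, `getToken` returns that newline, and the next call starts again with `skipWhitespace`, so the indentation before the second comment and the spaces before the trailing comment are both consumed. With these assumptions, a single skip-whitespace-then-skip-comment pass per call is enough.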

I’ll look into whether there’s a better way to do it based on your suggestion, thanks!

I've done the same thing for the same reason but in JavaScript and compiling a lisp dialect.

https://notes.eatonphil.com/compiler-basics-lisp-to-assembly...

That's very well done, thanks for sharing. I really like the straightforward writing style. I've never attempted to write a compiler and have only written basic parsers for day-to-day tasks, so I like the clean, simple approach.
Very glad to hear it! Part 2 will be out in a few days and part 3 the next week.