

by cdcarter 1114 days ago

I agree on the suggestion to do part two, it's where things get really fun!

One thing you can do with the finished Lox (or Monkey, if you prefer WACIG) before going into the world of intermediate representations is implementing a peephole optimizer. You look for reducible patterns in the bytecode, and replace them with optimized bytecode. You can also look for certain patterns and replace naive implementations with native builtins/intrinsics. You can work with the raw bytes of the bytecode, so you don't need to introduce an IR just yet.
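To make the idea concrete, here is a minimal sketch of a peephole pass over raw bytes. The opcode values, the one-byte immediate operand, and the names `decode` and `peephole` are all made up for illustration; they are not the encoding Lox or Monkey actually uses. It also assumes a program with no jumps, so it never has to fix up offsets when instructions shrink.

```python
# Hypothetical opcodes for a toy stack VM (not Lox's or Monkey's real encoding).
PUSH, ADD, MUL, POP, RET = 0x01, 0x02, 0x03, 0x04, 0x05

# Number of operand bytes that follow each opcode.
OPERANDS = {PUSH: 1, ADD: 0, MUL: 0, POP: 0, RET: 0}


def decode(code: bytes) -> list[tuple[int, ...]]:
    """Split raw bytes into (opcode, *operands) tuples."""
    out, i = [], 0
    while i < len(code):
        n = OPERANDS[code[i]]
        out.append(tuple(code[i : i + 1 + n]))
        i += 1 + n
    return out


def peephole(code: bytes) -> bytes:
    """Rewrite small windows of instructions until nothing changes."""
    ins = decode(code)
    changed = True
    while changed:
        changed = False
        i = 0
        while i < len(ins):
            window = ins[i : i + 3]
            # PUSH a, PUSH b, ADD|MUL  ->  PUSH (a op b), if it fits in a byte
            if (
                len(window) == 3
                and window[0][0] == PUSH
                and window[1][0] == PUSH
                and window[2][0] in (ADD, MUL)
            ):
                a, b = window[0][1], window[1][1]
                folded = a + b if window[2][0] == ADD else a * b
                if folded <= 0xFF:
                    ins[i : i + 3] = [(PUSH, folded)]
                    changed = True
                    continue  # re-check the same position for a new match
            # PUSH a, POP  ->  nothing (the value is never used)
            if len(window) >= 2 and window[0][0] == PUSH and window[1][0] == POP:
                del ins[i : i + 2]
                changed = True
                continue
            i += 1
    return bytes(b for instr in ins for b in instr)
```

For example, `peephole(bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, RET]))` folds `(2 + 3) * 4` down to a single `PUSH 20` followed by `RET`. A real pass would also have to leave alone any instruction that is a jump target, and patch jump offsets after deleting bytes.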

The Apex compiler at Salesforce does the vast majority of its optimizations as peeps.

EDIT: I _just_ wrote another comment to someone asking similar questions a few days ago. Here's a link to the parent question; check out my thoughts as well as others' in the thread. https://news.ycombinator.com/item?id=36119915

2 comments

lukemerrick 1113 days ago

Wow, the whole concept of a peephole optimizer is a bit mind blowing to me. I'm appreciating all the reasons to power through to writing a bytecode VM as the next step.

I'm not sure how far down the compiler I will actually enjoy going vs. exploring ideas around type systems, linters, etc. up near the AST level, but if I do venture down, this advice will certainly come in handy!


erichocean 1113 days ago

Do you have any pointers to papers/books exploring peephole optimizers? I've read a few papers, but found them underwhelming…


cdcarter 1113 days ago

I don't have an academic background, and I'll agree that the majority of books I have picked up haven't covered the subject very well. It's a pretty common technique though, for assembly and for bytecode, so I've learned by reading implementations. `peephole.c` in CPython is particularly easy to read with a small understanding of the CPython API. It's a very limited implementation, but the idea goes far. Lots of things that you might expect the compiler to optimize in IR can be done directly from the bytecode instead.
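As one illustration of an optimization done directly on bytecode, here is a rough sketch of jump threading: a jump whose target is itself an unconditional jump gets retargeted to the final destination. This is not a transcription of CPython's code. The opcodes, the one-byte absolute target encoding, and the name `thread_jumps` are assumptions made up for the example.

```python
# Hypothetical opcodes; each jump is followed by a one-byte absolute target.
JMP, JMP_IF_FALSE, NOP, RET = 0x10, 0x11, 0x12, 0x13
JUMPS = (JMP, JMP_IF_FALSE)


def thread_jumps(code: bytes) -> bytes:
    """Retarget jumps that land on an unconditional JMP to its destination."""
    out = bytearray(code)
    i = 0
    while i < len(out):
        if out[i] in JUMPS:
            target = out[i + 1]
            seen = {i}  # guards against cycles such as a JMP to itself
            # Follow the chain while the target is another unconditional JMP.
            while target < len(out) and out[target] == JMP and target not in seen:
                seen.add(target)
                target = out[target + 1]
            out[i + 1] = target
            i += 2  # skip the opcode and its operand byte
        else:
            i += 1
    return bytes(out)
```

Because the rewrite only changes operand bytes, the code keeps its length and no other offsets need patching. The intermediate `JMP` may become unreachable afterwards, which a separate dead-code pass could then remove.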
