Hacker News new | ask | show | jobs
by stevekemp 2224 days ago
As a concrete example I wrote a simple interpreter a while back, in golang. Operations would be compiled to a simple bytecode which would be executed at runtime. So:

     3 + 4
Would get "compiled" to:

     PUSH_INT 3
     PUSH_INT 4
     OP_PLUS
As you can imagine this was a virtual stack-machine. After a while I started folding these operations:

* If there was a "push int"

* And another "push int"

* And a maths operation.

* Then replace.

So in my case I'd first generate the naive version, then have a separate "optimise" pass which would rewrite the program:

     PUSH_INT 7
     NOP
     NOP
     NOP
     NOP
A later optimization pass would remove consecutive NOP operations. I made a brief blog-post here:

https://blog.steve.fi/adventures_optimizing_a_bytecode_based...

There were a couple of bugs found along the way, but the actual implementation of this optimization pass was pretty simple:

https://github.com/skx/evalfilter/pull/66/files

The biggest issue I had to face was that "1 + 3 => 4" is trivial, but my virtual machine only supports loading integers. So I couldn't collapse "2 * 3.3" into the constant "6.6".

1 comments

For anyone wondering, the NOPs are so that jumps don't get messed up. Assembly level GOTOs generally work on byte/word offsets, so if something intended to jump back before the calculation but was expecting two PUSH_INTs and an OP_PLUS (5 words: 3 opcodes, 2 arguments) but there was only a single PUSH_INT (2 words), it would jump back 3 words too far.
Good clarification, and exactly right!

Later I do walk back over the bytecode and remove the nops. But I have to update all JMP/CALL instructions to cope with the changed destination offsets. Not a hard job in my case, as there are only a couple of instructions which refer to byte-offsets.

(If I allowed "LD A,[BC]", or similar permuations I'd have a lot more work to do.)