|
|
|
|
|
by _RPM
3551 days ago
|
|
Interesting. One thing that I still haven't solved yet is the "break" and "continue" statement inside loops. For a break statement, it seems like it would just be the same a JMP with an address as the operand, but there would need to be some sort of registration of "The VM is in the loop now, and the break address is X", and continue would also be a JMP with an address to the top of the code for the loop. I haven't implemented those in my system yet, and also have no idea how Python or PHP does it. Is PHP's VM a stack based one? I do read the Zend/ directory of PHP's source, but it is really hard to follow and there is virtually no documentation on the VM |
|
That's relatively easy when implementing an assembler for your opcodes. Just keep track of symbolic labels and their associated jump points (as a simple array or linked list) and process (i.e. finalize or "link") the jump points when the label address becomes known. My "assemblers" often have constructs like:
where L? registers a symbolic jump destination (i.e. x.label[0].offset = label_offset) and J? emits an unconditional jump and registers a link request (i.e. push(x.label[1].from, jump_opcode_offset)). When a block is finished all the offsets are known; you just process things like Knowing when to emit symbolic label and jump instructions from the AST is a little more involved, but no more than analyzing the AST for anything else.Supporting computed gotos would require much more bookkeeping, I'd imagine, and I'm not surprised few languages support that construct. Or maybe not... I haven't really thought it through.
One cool thing about this whole exercise is that it helps to demonstrate why generating some intermediate representation can be easier (conceptually and mechanically) than directly generating runnable code in a single pass. It seems more complex but it really makes things easier.