by mehrdadn 2584 days ago
For multiple passes, is it ever possible that the next pass shrinks the size of the code? Off the top of my head I can't tell whether this is possible, but I've always wondered, since it would suggest you could end up bouncing back and forth on every pass unless you explicitly avoid it somehow. Can that happen?

It could, depending on how you initially encode the forward jmp instruction. If you encode them all with a 16-bit displacement and your assembler optimizes, then a later pass could shrink the code by switching to the 8-bit jmp variant, for example.
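To make the shrinking concrete, here is a minimal sketch in Python. It uses a toy model I made up for illustration, not real x86 encodings: a jmp is 1 opcode byte plus a 1- or 2-byte displacement, `("pad", n)` stands in for n bytes of other code, and the names `layout` and `shrink` are hypothetical. Every jump starts wide and gets narrowed once its displacement fits in a signed byte. Note that shrinking one jump can pull another into range, which is why it can take more than one extra pass.

```python
# Toy model (assumption): jmp = 1 opcode byte + 1- or 2-byte displacement.
# Program items: ("jmp", label), ("pad", n_bytes), ("label", name).

def layout(program, widths):
    """Lay out the program for the given displacement widths.

    Returns ({jump index: displacement}, total size in bytes).
    Displacements are measured from the end of the jump instruction.
    """
    labels, ends, addr = {}, {}, 0
    for i, item in enumerate(program):
        if item[0] == "label":
            labels[item[1]] = addr
        elif item[0] == "pad":
            addr += item[1]
        else:  # jmp
            addr += 1 + widths[i]
            ends[i] = addr
    disps = {i: labels[program[i][1]] - end for i, end in ends.items()}
    return disps, addr

def shrink(program):
    """Start every jump at 2 bytes of displacement, narrow until stable."""
    widths = {i: 2 for i, item in enumerate(program) if item[0] == "jmp"}
    passes = 0
    while True:
        passes += 1
        disps, total = layout(program, widths)
        fits = [i for i in widths
                if widths[i] == 2 and -128 <= disps[i] <= 127]
        if not fits:
            return widths, total, passes
        for i in fits:
            widths[i] = 1  # only ever shrink, so the loop must terminate

# The outer jump is 128 bytes from its target until the inner one shrinks.
program = [("jmp", "end"), ("pad", 60), ("jmp", "end"), ("pad", 65),
           ("label", "end")]
widths, total, passes = shrink(program)
print(widths, total, passes)  # {0: 1, 2: 1} 129 3
```

Because this loop only ever narrows a jump and never widens one, it cannot bounce back and forth. In this toy model (no alignment directives), removing bytes never pushes a jump further from its target, so a narrowed jump stays in range.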

The way I got around this issue entirely was to encode all jumps as the 16-bit variant. Then, during the 2nd pass, I would check for overflow and halt with an error if the jmp target was no longer within range. I had a simple type declaration syntax for "byte", "word", and "dword" that you could use to coerce a label to a specific size (and thus control which instruction variant the assembler uses).

So the assembler is actually a simple 2-pass assembler. The 2nd pass locks the code size and errors out if the 1st pass assumptions and types do not hold.
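Roughly, the two passes look like the sketch below. This is a Python illustration and not my original assembler: the tuple-based program format, the 1-byte opcode, and the names `assemble` and `JumpOutOfRange` are all assumptions made up for the example. The point is that pass 1 fixes every instruction size from its declared type, and pass 2 only verifies that the displacements fit.

```python
# Toy model (assumption): jmp = 1 opcode byte + displacement of declared width.
# Program items: ("jmp", label, type), ("pad", n_bytes), ("label", name).
DISP_BYTES = {"byte": 1, "word": 2, "dword": 4}

class JumpOutOfRange(Exception):
    """Raised in pass 2 when a displacement does not fit its declared type."""

def size_of(item):
    if item[0] == "label":
        return 0
    if item[0] == "pad":
        return item[1]
    return 1 + DISP_BYTES[item[2]]  # jmp

def assemble(program):
    # Pass 1: sizes depend only on declared types, so addresses are final.
    labels, addr = {}, 0
    for item in program:
        if item[0] == "label":
            labels[item[1]] = addr
        addr += size_of(item)
    total = addr

    # Pass 2: resolve displacements and verify the pass-1 assumptions hold.
    resolved, addr = [], 0
    for item in program:
        size = size_of(item)
        if item[0] == "jmp":
            _, target, kind = item
            disp = labels[target] - (addr + size)  # relative to next insn
            bits = 8 * DISP_BYTES[kind]
            if not -(1 << (bits - 1)) <= disp < (1 << (bits - 1)):
                raise JumpOutOfRange(
                    f"jmp {target} at {addr}: {disp} does not fit in {kind}")
            resolved.append((addr, target, kind, disp))
        addr += size
    return resolved, total
```

For example, `assemble([("jmp", "end", "word"), ("pad", 200), ("label", "end")])` resolves the jump with a displacement of 200, while declaring the same jump as "byte" raises `JumpOutOfRange`, since 200 does not fit in a signed 8-bit displacement.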

Otherwise, as userbinator mentioned, your problem becomes a backtracking/constraint problem. It's an interesting CS problem, but not so fun when all you want is a working assembler.

edit: I should point out that the initial jmp encoding may depend on the operand mode (i.e. 16/32/64-bit instruction mode). It's been a while and I can't remember all the details there, plus my assembler was written a few years before 64-bit on the x86 became mainstream, so YMMV.

Thanks!