Hacker News new | ask | show | jobs
by haberman 4763 days ago
Very nice! I wouldn't have thought to use flex to recognize optimizable patterns. :) It crashes for me on OS X but I'll have to try it on Linux.

You may have tried this already, but where you have:

    mov al, byte [PTR]
Usually you want to write this instead to avoid a partial register stall (also in case there's junk sitting in the register).

    movzx  eax, byte [PTR]
1 comments

Hm, the OSX crash is probably because they have a different ABI from Linux -- Google "osx ia32 abi" for details-- I'm probably violating stack alignment.

Partial register stalls are where you write part of the register, then read the full register:

    mov al, byte [PTR]
    add ebx, eax
But I am emitting this:

    mov al, byte [PTR]
    add byte [PTR+3], al
So there's no stall.

I got jit4 about as far as is reasonable for a single-pass compiler, but I intend to implement a proper BF->IR->ASM compiler to implement some more complex multi-pass optimizations I have in mind.

Yes, I have encountered the problem before that OS X crashes on incorrect stack alignment even where other platforms tolerate it.

Good point about the stall -- I always thought that partial register stall happened at the point that you do the partial write, since the logical contents of the register now depend on its previous contents. I didn't realize that the dependency logic was sophisticated enough to allow the partial read without depending on the entire register's value.