It sometimes can, but you then have to balance the time spent optimizing against the time spent actually doing whatever you were optimizing.
Also on modern chips you must wait quite a number of cycles before executing modified code or endure a catastrophic performance hit. This is ok for loops and stuff, but makes a lot of the really clever stuff pointless.
A debugger's software breakpoints _are_ self-modifying code :)
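To make that concrete, here is a minimal sketch of how a software breakpoint patches code. It works on a plain `bytearray` standing in for the target's code, not on a live process, and the helper names `set_breakpoint` and `clear_breakpoint` are made up for illustration. The only hardware fact it relies on is that `0xCC` is the one-byte x86 `int3` instruction.

```python
INT3 = 0xCC  # x86 one-byte breakpoint instruction (int3)


def set_breakpoint(code: bytearray, offset: int) -> int:
    """Overwrite the byte at `offset` with INT3 and return the original byte."""
    saved = code[offset]
    code[offset] = INT3
    return saved


def clear_breakpoint(code: bytearray, offset: int, saved: int) -> None:
    """Put the original byte back so the instruction can run normally."""
    code[offset] = saved


# x86 machine code for: mov eax, 1 ; ret
code = bytearray(b"\xb8\x01\x00\x00\x00\xc3")

saved = set_breakpoint(code, 0)      # the first opcode byte becomes int3
patched = bytes(code)                # snapshot of the modified code
clear_breakpoint(code, 0, saved)     # restore before resuming
```

A real debugger does the same write through an OS interface into the debuggee's memory, then restores the byte and single-steps when the trap fires.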
I used the GNU lightning library once for this kind of optimisation. I think it was for the ICFPC 2006 task, where I had to write an interpreter for a virtual machine. The naive approach worked but was slow, so I decided to speed it up a bit using a JIT. It wasn't a 100% JIT; I think I only implemented it for loops, but that was enough to speed it up tremendously.
Programs from the 80s-90s are likely to have such tricks. I have done something similar to "hardcode" semi-constants like frame sizes and quantisers in critical loops related to audio and video decompression, and the performance gain is indeed measurable.
Say you set a value for some reason, and later you have to check IF it is set. If that condition gets checked many times, then instead of storing a value somewhere and testing it, you patch the code itself at the point where the check would be. And if you need to check repeatedly whether something is still true, you overwrite the condition check with no-ops once it isn't.
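A high-level analogue of this, as a rough sketch: instead of testing a flag inside the hot loop, swap the callable that the loop invokes. This is not machine-code patching, just the same idea one level up, and the `Pipeline` class and its method names are hypothetical.

```python
def _noop(sample):
    """Stands in for the patched-out check: does nothing."""


class Pipeline:
    def __init__(self):
        self.seen = []
        self.hook = _noop              # "no-ops" installed by default

    def enable_tracing(self):
        # Patch the code path instead of setting a flag to test later.
        self.hook = self.seen.append

    def disable_tracing(self):
        self.hook = _noop

    def process(self, samples):
        total = 0
        for s in samples:
            self.hook(s)               # no `if tracing:` test in the hot loop
            total += s
        return total


p = Pipeline()
first = p.process([1, 2, 3])           # tracing off: hook is a no-op
p.enable_tracing()
second = p.process([4, 5])             # tracing on: samples are recorded
p.disable_tracing()
third = p.process([6])
```

In native code the equivalent is overwriting the compare-and-branch with `nop` bytes, which removes even the indirect call this version still pays for.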
Also funny are insanely large loop unrolls with hard-coded values. You could make a kind of rainbow table of those.
You mean you somehow avoided a load. But what if the constant was already in a register? Also how could you pinpoint the reference to your constant in the machine code? I'm quite a layman about all this.
> Also how could you pinpoint the reference to your constant in the machine code?
Not OP, but often one uses an easily identifiable dummy pattern like 0xC0DECA57 or 0xDEADBEEF which can be substituted without also messing up the machine code.
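A minimal sketch of the placeholder approach, assuming the code was built with `0xC0DECA57` as a 32-bit little-endian immediate. The function name `patch_constant` and the tiny `template` are invented for the example; a real patcher would work on the actual compiled routine and copy the result into executable memory.

```python
import struct

PLACEHOLDER = 0xC0DECA57  # dummy immediate compiled into the template


def patch_constant(code: bytes, value: int, placeholder: int = PLACEHOLDER) -> bytes:
    """Return a copy of `code` with the 32-bit placeholder replaced by `value`."""
    needle = struct.pack("<I", placeholder)
    # Refuse to patch if the pattern is missing or ambiguous: a second match
    # could be unrelated opcode bytes that merely look like the placeholder.
    if code.count(needle) != 1:
        raise ValueError("placeholder must appear exactly once")
    return code.replace(needle, struct.pack("<I", value))


# x86 machine code for: mov eax, 0xC0DECA57 ; ret
template = b"\xb8" + struct.pack("<I", PLACEHOLDER) + b"\xc3"

# Bake in a frame width of 640 (an arbitrary example value).
patched = patch_constant(template, 640)
```

The exactly-once check is the reason for picking an unusual pattern: the less likely it is to occur by accident in opcodes or other data, the safer a blind byte search becomes.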
If you’re willing to parse object files (a much easier proposition for ELF than for just about anything else), another option is to have the source code mention the constants as addresses of external symbols, then parse the relocations in the compiled object. Unfortunately, I’ve been unable to figure out a reliable recipe to get a C compiler to emit absolute relocations in position-independent code, even after restricting myself to GCC and Clang for x86 Linux; in some configurations it works and in others you (rather pointlessly) get a PC-relative one followed by an add.
If you are generating or modifying code at runtime then how is that different from bytecode? Standardised bytecodes and JITs are just an organised way of doing the same thing.
LuaJIT has a wonderful dynamic code generation system in the form of the DynASM[1] library. You can use it separately from LuaJIT for dynamic runtime code generation to create machine code optimized for a particular problem.