Hacker News new | ask | show | jobs
by wallunit 3931 days ago
> I wonder if there's a way to remove the extra nops?

Yes, technically it would be possible to remove the code instead injecting NOPs. But then I'd have to adjust the jump targets. However, I don't think these NOPs are too bad. Note that goto jumps directly after the NOP ramp of the label, and the NOPs of the goto itself are never reached. The only scenario where NOPs are actually seen by the interpreter is when the natural code path visits a label.

EDIT: Instead re-assembling the bytecode and adjusting jumps, I went to use JUMP_FORWARD(3) instead 7 NOPs now. Note that there are still 4 NOPs left to fill the gap, these however are never executed as they are skipped by the preceding jump instruction. https://github.com/snoack/python-goto/commit/2b0f5e5069cbb88...

2 comments

Is it certain that JUMP_FORWARD is faster than a few NOPs?
I did run the example from the README, with "%timeit range(0, 1000)" in ipython:

10000 loops, best of 3: 72.9 µs per loop @ CPython 2.7.10 with NOP

10000 loops, best of 3: 77.2 µs per loop @ CPython 2.7.10 with JUMP_FORWARD

10000 loops, best of 3: 106 µs per loop @ CPython 3.5.0 with NOP

10000 loops, best of 3: 106 µs per loop @ CPython 3.5.0 with JUMP_FORWARD

100000 loops, best of 3: 8.6 µs per loop @ PyPy 2.4.0 with NOP

100000 loops, best of 3: 8.7 µs per loop @ PyPy 2.4.0 with JUMP_FORWARD

To my surprise, in fact, JUMP_FORWARD isn't any faster than 7 NOPs. In Python 2.7, JUMP_FORWARD is even slower. So reverted:

https://github.com/snoack/python-goto/commit/d19d244a9e5efdf...

Thanks for the pointer!

D'oh. Yes, of course.

It'd be interesting to know whether peculiar basic block graphs upset PyPy's JIT. Apparently Lua's JIT is much less good at optimising programs that don't look like normal Lua.