| Here's how I would write this in Cython, using "pure C" arrays: https://gist.github.com/syllog1sm/3dd24cc8b0ad925325e1 It gets 18,000 steps/second, in the same ballpark as your C code. I prefer to "write C in Cython" because I find it easier to read than the numpy code. This may be my bias, though: I've been writing almost nothing but Cython for about two years now. Btw, if anyone's interested, "cymem" is a small library of mine, installable via pip. It ties memory to a Python object's lifetime: all it does is remember which addresses it gave out, and when your Pool is garbage collected, it frees that memory. Edit: GH fork, with code to compile and run the Cython version: https://github.com/syllog1sm/python-numpy-c-extension-exampl... I hacked his script quickly. |
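
To make the "remember the addresses, free them when the Pool is collected" idea concrete, here is a rough sketch in plain Python with ctypes. It is not cymem's actual code: `TinyPool`, `alloc`, and `close` are hypothetical names chosen for illustration, and it assumes a POSIX system where `ctypes.CDLL(None)` exposes the C library's `calloc` and `free`.

```python
import ctypes

# Assumption: POSIX, where CDLL(None) gives access to the C library's symbols.
_libc = ctypes.CDLL(None)
_libc.calloc.restype = ctypes.c_void_p
_libc.calloc.argtypes = [ctypes.c_size_t, ctypes.c_size_t]
_libc.free.restype = None
_libc.free.argtypes = [ctypes.c_void_p]


class TinyPool:
    """Hypothetical illustration of the idea only; not cymem's implementation."""

    def __init__(self):
        self.addresses = set()   # every address this pool has handed out
        self._free = _libc.free  # keep a reference so __del__ can still reach it

    def alloc(self, number, elem_size):
        """Allocate zeroed memory for `number` items and remember its address."""
        addr = _libc.calloc(number, elem_size)
        if not addr:             # ctypes returns None for a NULL pointer
            raise MemoryError("calloc failed")
        self.addresses.add(addr)
        return addr

    def close(self):
        """Free every remembered address; return how many blocks were freed."""
        count = len(self.addresses)
        for addr in self.addresses:
            self._free(addr)
        self.addresses.clear()
        return count

    def __del__(self):
        # Runs when the pool object is garbage collected, so the C memory
        # lives exactly as long as the Python object that owns it.
        self.close()
```

In use, an object would hold a `TinyPool` as an attribute and allocate all of its C arrays through it, so nothing needs an explicit `free` call. The real library is meant to be used from Cython; its README is the place to check the actual API.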