|
What he refers to is that you have to ask for them explicitly:

    import numpy as np
    arr = np.array([1, 3, 7, 5, 4, 3, 1, .100])
    maxval = arr.max()  # 1st pass
    minval = arr.min()  # 2nd pass

Whereas with numba you'd have something like this:

    from numba import jit

    @jit
    def maxmin(arr):
        # Start both extremes at the first element
        maxval = minval = arr[0]
        for e in arr:  # single pass over the data
            if e < minval: minval = e
            if e > maxval: maxval = e
        return minval, maxval

And that will get optimized to NumPy-like speeds, but with a single pass over the data. So for large arrays you'd get about a 2x speedup, since memory access is the bottleneck.

As for optimizing this use case for NumPy, I'd go for a Cythonized maxmin() function, which is pretty much what numba does, except that you move the compilation overhead from the JIT to the module's build step. |
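To make the one-pass versus two-pass comparison concrete, here is a minimal sketch that checks both approaches return the same values. It assumes NumPy is installed; numba is optional, and the fallback decorator and the name `maxmin_single_pass` are illustrative rather than anything from the thread.

```python
import numpy as np

try:
    from numba import njit  # compiles the loop to machine code on first call
except ImportError:
    # Assumption for illustration: without numba, run the same loop
    # as plain (slow) Python so the example still executes.
    def njit(func):
        return func


@njit
def maxmin_single_pass(arr):
    """Return (minval, maxval), reading each element exactly once."""
    minval = arr[0]
    maxval = arr[0]
    for e in arr:
        if e < minval:
            minval = e
        if e > maxval:
            maxval = e
    return minval, maxval


arr = np.array([1, 3, 7, 5, 4, 3, 1, 0.100])

two_pass = (arr.min(), arr.max())    # two separate traversals of arr
one_pass = maxmin_single_pass(arr)   # one traversal

print(two_pass == one_pass)
```

This only checks that the results match. Whether the single pass is actually faster depends on array size and on the loop being compiled, so it is worth timing on your own data.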
Regarding moving the compilation overhead, yeah, I get that. My point is that the module's compile step happens once, no? Whereas the JIT is something you force onto every execution, right?
And none of this means this library shouldn't have been made, just that this is a poor example of why it is better.
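On the question above of when the JIT cost is paid: as far as I know, numba compiles a function the first time it is called with a given argument type in each process, not on every call. A rough way to check that yourself is to time two consecutive calls. This sketch assumes NumPy is installed; numba is optional, and `total` and `timed` are made-up names for illustration.

```python
import time

import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: without numba there is nothing to compile, so both
    # calls below simply run the plain Python loop.
    def njit(func):
        return func


@njit
def total(arr):
    """Sum the array with an explicit loop."""
    s = 0.0
    for e in arr:
        s += e
    return s


def timed(func, *args):
    """Call func(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


arr = np.arange(1_000_000, dtype=np.float64)
_, first = timed(total, arr)    # under numba, includes compilation
_, second = timed(total, arr)   # reuses the already compiled function
print(f"first call: {first:.4f}s, second call: {second:.4f}s")
```

numba's `jit` also accepts `cache=True`, which stores the compiled code on disk so that later runs can skip recompiling. How close that gets to an ahead-of-time Cython build in practice is something to measure rather than assume.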