by aktiur 3928 days ago
Numba is indeed pretty impressive, but you're not comparing exactly the same thing with this code.

In the Numba case, you're modifying the image in place, which means no new array is allocated and no full copy is made. Your pure-NumPy code, however, creates a new array (the result of np.dot) and then copies it back in its entirety into image.
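To make the distinction concrete, here is a minimal sketch of the two NumPy styles being contrasted. It is not the code from the gist: the function names, the 3x3 matrix and the (height, width, 3) image shape are assumptions chosen for illustration.

```python
import numpy as np

def transform_in_place(image, matrix):
    # Hypothetical helper: np.dot still allocates a temporary result,
    # which is then copied back into the caller's buffer.
    image[:] = np.dot(image, matrix.T)
    return image

def transform_copy(image, matrix):
    # Hypothetical helper: the input is left untouched and the
    # freshly allocated result is returned as-is (no copy back).
    return np.dot(image, matrix.T)

rng = np.random.default_rng(0)
image = rng.random((4, 5, 3))    # assumed layout: height x width x channels
matrix = rng.random((3, 3))      # assumed 3x3 per-pixel colour transform
original = image.copy()

out = transform_copy(image, matrix)
unchanged_after_copy = np.array_equal(image, original)
same_buffer = transform_in_place(image, matrix) is image
```

The first helper pays for both the temporary and the copy back, which is the extra work a loop that writes each pixel directly into image avoids.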

If you write the two functions so that they both return a new numpy array and do not touch the original one, Numba's advantage drops from 4 times faster to 2.5 times faster. That's still an impressive difference, but it comes at the cost of a bit of flexibility.

https://gist.github.com/aktiur/e1cddee8f699ded49824

N.B.: numpy.dot does not use broadcasting, i.e. it does not allocate a temporary array to extend the smaller one. For n-dimensional arrays, it sums over the last axis of the first array and the second-to-last axis of the second array.
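A small sketch of that axis rule, using made-up shapes (a 2x4 "image" with 3 channels and a 3x4 matrix, both assumptions for illustration). It checks np.dot against np.tensordot with the contracted axes spelled out explicitly.

```python
import numpy as np

a = np.arange(24, dtype=float).reshape(2, 4, 3)   # last axis has length 3
b = np.arange(12, dtype=float).reshape(3, 4)      # second-to-last axis has length 3

# np.dot contracts a's last axis with b's second-to-last axis.
result = np.dot(a, b)

# The same contraction with the axes named explicitly.
explicit = np.tensordot(a, b, axes=([-1], [-2]))
```

Here result has shape (2, 4, 4): the contracted axis of length 3 disappears and the remaining axes of a and b are concatenated.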

1 comment

Thanks, I clearly wasn't being careful. I'll update my Gist...

edit: On reviewing, I think the intent of the original blog post was to modify images in place (or at least to do it as quickly as possible, with in-place filtering being acceptable). In that case, I think my comparison is fair, since NumPy doesn't offer a faster way to do the requested operation. I didn't try out einsum, but I think Numba would outperform that as well.
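For anyone who wants to try the einsum route, a rough sketch of what it could look like follows. The function name, the 3x3 matrix and the (height, width, 3) image shape are assumptions, and no timing is claimed here; it only shows that einsum can express the same per-pixel transform as np.dot.

```python
import numpy as np

def transform_einsum(image, matrix):
    # Hypothetical helper: out[h, w, i] = sum_j matrix[i, j] * image[h, w, j]
    return np.einsum('ij,hwj->hwi', matrix, image)

rng = np.random.default_rng(1)
image = rng.random((4, 5, 3))    # assumed layout: height x width x channels
matrix = rng.random((3, 3))      # assumed 3x3 per-pixel colour transform

via_einsum = transform_einsum(image, matrix)
via_dot = np.dot(image, matrix.T)
```

Like the np.dot version, this returns a newly allocated array, so it would be compared against the non-in-place variants.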