by onalark 3924 days ago
Great post, William. I really appreciated your exposition on both the challenges you folks were facing and your solution to the problem.

As a few others have pointed out, sending a lambda function through NumPy is almost always the last thing you want to do. Unfortunately, you were in a situation where you were going to have to do something really painful, either using einsum: https://stackoverflow.com/questions/29989059/matrix-multipli... or writing your own ufunc: https://docs.scipy.org/doc/numpy-dev/user/c-info.ufunc-tutor...

I suspect that your primary limitation was the Google Compute Engine infrastructure. I'm not familiar with the limitations there, but a quick search on Google turns up a fairly limited set of libraries indeed.

I thought it would be interesting to adapt your code slightly to use Numba acceleration. Here's what it looks like:

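  from numba import jit

  def avg_transform(image):
      # image has shape (rows, columns, channels); it is modified in place
      m, n, c = image.shape
      for i in range(m):
          xi = image[i]
          for j in range(n):
              # average the three channels of pixel (i, j)
              avg = xi[j].sum()/3
              # write the average back into every channel
              xi[j][:] = avg
      return image

  # compile the plain Python loop to machine code
  fast_avg_transform = jit(avg_transform, nopython=True)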
I observed processing times of 25ms per image on my laptop for 1280x720 pixel images: https://gist.github.com/ahmadia/c1f8be119f3cb2d2b8e5

Re-reading your post, I suspect that einsum might actually be your cup of tea, but I really enjoy the simplicity and performance of using Numba for these sorts of tasks.
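
For comparison, here is a rough idea of what the einsum route might look like for the same per-pixel channel average. This is only a sketch: the test image and the names `avg` and `gray` are made up for illustration, and unlike the Numba version it returns a new array rather than modifying the image in place.

```python
import numpy as np

# Hypothetical test image: rows x columns x 3 channels, float values.
rng = np.random.default_rng(0)
image = rng.random((4, 5, 3))

# Sum over the channel axis k, then divide by the number of channels.
avg = np.einsum('ijk->ij', image) / image.shape[2]

# Copy the per-pixel average into every channel of a new array.
gray = np.repeat(avg[:, :, np.newaxis], image.shape[2], axis=2)
```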


I haven't used Numba -- looks fast and easy!

But am I missing something? NumPy has everything you need already, natively, no? Some slicing or a dot product should get you there... no need for ufuncs or einsum, I think...

  avg = (rgb[...,0]+rgb[...,1]+rgb[...,2]) * (1.0 / 3.0)
or better yet,

  gray = np.dot(rgb, [0.299, 0.587, 0.114])
More generally, for an image im with shape (width, height, channels) and a square transformation matrix M of shape (channels, channels), you can do:

  res = np.dot(im, M.T)
It will work with affine transformations as well if you add a 1 component to every pixel. It should also work with higher-dimensional images, if I'm not mistaken.
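
A minimal sketch of both cases, assuming a channels-last float image. The specific values of `M`, `A` and `b` are made up for illustration, and stacking them into a non-square (3, 4) matrix `Ab` is one possible way to read "add a 1 component", not necessarily the one intended above.

```python
import numpy as np

rng = np.random.default_rng(0)
im = rng.random((4, 5, 3))          # hypothetical image, channels last

# Linear case: each pixel p becomes M @ p.
M = np.full((3, 3), 1.0 / 3.0)      # averaging matrix, chosen for illustration
res = np.dot(im, M.T)

# Affine case: p -> A @ p + b, folded into one (3, 4) matrix [A | b].
A = np.eye(3) * 0.5
b = np.array([0.1, 0.2, 0.3])
Ab = np.hstack([A, b[:, np.newaxis]])            # shape (3, 4)

# Append a constant 1 to every pixel so the last column of Ab adds b.
ones = np.ones(im.shape[:-1] + (1,))
im_h = np.concatenate([im, ones], axis=-1)       # shape (4, 5, 4)
res_affine = np.dot(im_h, Ab.T)                  # shape (4, 5, 3)
```

Since np.dot only contracts the last axis of the image, the same call should apply unchanged to a stack of images with extra leading axes.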

Agreed. For some reason, when I was looking at this last night I thought I couldn't use broadcasting. I've added the example to the gist: https://gist.github.com/ahmadia/c1f8be119f3cb2d2b8e5

Would you believe that Numba is 4 times faster for the sort of simple transformations described in the blog post?

(See aktiur's response below: some of the performance gain comes from avoiding a copy.)

Numba is indeed pretty impressive, but you're not comparing exactly the same thing with this code.

In the Numba case, you're modifying the image in place: that means no allocation of a new array and no full copy. Your pure-numpy code, however, creates a new array (the result of np.dot) before copying it back entirely into image.

If you write the two functions so that they both return a new numpy array and do not touch the original one, the time difference drops from 4 times faster to 2.5 times faster. That's still an impressive difference, but it comes at the cost of a bit of flexibility.
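
To make the distinction concrete, here is a pure-NumPy sketch of the two calling conventions. It is not the code from either gist; the function names `avg_copy` and `avg_inplace` are hypothetical, and it only illustrates what "returns a new array" versus "modifies in place" means, not the timings.

```python
import numpy as np

def avg_copy(image):
    """Return a new array; the input is left untouched."""
    avg = image.mean(axis=2, keepdims=True)
    return np.repeat(avg, image.shape[2], axis=2)

def avg_inplace(image):
    """Overwrite the input with its per-pixel channel average."""
    avg = image.mean(axis=2, keepdims=True)   # small temporary, one channel
    image[...] = avg                          # broadcast assignment into the existing buffer
    return image

rng = np.random.default_rng(0)
image = rng.random((4, 5, 3))                 # hypothetical float image
original = image.copy()

out = avg_copy(image)          # image is unchanged here
out2 = avg_inplace(image)      # image itself now holds the averages
```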

https://gist.github.com/aktiur/e1cddee8f699ded49824

N.B.: numpy.dot does not use broadcasting, i.e. it does not allocate a temporary array to extend the smaller one. The function handles n-dimensional arrays by summing over the last axis of the first array and the second-to-last axis of the second array.
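
A small shape check of that rule, with made-up array sizes. The tensordot line is only there as a cross-check of which axes get contracted in the image case.

```python
import numpy as np

a = np.ones((2, 3, 4))
b = np.ones((5, 4, 6))

# The last axis of a (length 4) is contracted with the second-to-last axis of b.
out = np.dot(a, b)
print(out.shape)   # (2, 3, 5, 6)

# Image case: (rows, columns, channels) times (channels, channels).
im = np.arange(24, dtype=float).reshape(2, 4, 3)
M = np.arange(9, dtype=float).reshape(3, 3)
same = np.allclose(np.dot(im, M.T), np.tensordot(im, M.T, axes=([2], [0])))
```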

Thanks, I clearly wasn't being careful. I'll update my Gist...

edit: On reviewing, I think the intent of the original blog post was to modify images in place (or at least to do it as quickly as possible, with in-place filtering being acceptable). In that case, I think my comparison is fair, since NumPy doesn't offer a faster way to do the requested operation. I didn't try out einsum, but I think Numba would outperform that as well.