Hacker News new | ask | show | jobs
by 00ajcr 2135 days ago
To broadcast operations (such as addition) between arrays in NumPy, trailing dimensions have to be equal (or be of length 1).

In the example given above the 3D array and 2D array have shape (lengths of dimensions):

   (2, 3, 4)
      (2, 3)
That is - the suffixes do not agree (4 != 3 and 3 != 2) and NumPy raises an error.

However, for the same operation in J the prefixes agree:

   (2, 3, 4)
   (2, 3)
and the addition gives the expected result.

To add the arrays with these shapes in NumPy, one method is transpose each array (reverse order of the dimensions), add these arrays, and then transpose back:

  (a.T + b.T).T
1 comments

Thinking about these as a "verbose imperative language" programmer, I'd say suffix agreement seems to make more sense to me at first glance, and I'd like to hear more about why it might be worse.

For why I think it makes sense: I think of multidimensional arrays as being arrays of arrays, and "normal" index lookup operating on the first dimension. If I have a

    float[100][3]
in some context I might think of it as 100 vec3s, and I might want to do some vec3 operation on each of them. I might want to dot them all with my some other vector, or add them all to some other vector. I almost never have 100 scalars and want to apply one scalar to all elements of the corresponding vec3.

But I guess maybe this is all widely agreed on, and maybe the contentious part is just index order? Like, maybe you'd say "100 vec3s" is actually

    float[3][100]
in which case prefix agreement would make more sense.