Hacker News
by salamo 737 days ago
When I started out I was basically stumbling around for code that worked. Things got a lot easier for me once I sat down and actually understood broadcasting.

The rules are: 1) scalars always broadcast, 2) if one array has fewer dimensions, left-pad its shape with 1s, and 3) starting from the right, check dimension compatibility, where two dimensions are compatible if they are equal or one of them is 1. Example: np.ones((2,3,1)) * np.ones((1,4)) has shape (2,3,4), i.e. it equals np.ones((2,3,4)).
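To make the three rules concrete, here is a minimal sketch that applies them to shape tuples and cross-checks the result against NumPy. The helper name `broadcast_shape` is hypothetical (not a NumPy function); `np.broadcast_shapes` is the real equivalent and assumes NumPy 1.20 or newer.

```python
import numpy as np

def broadcast_shape(a, b):
    """Result shape for two shape tuples (hypothetical helper, not part of NumPy)."""
    # Rule 2: left-pad the shorter shape with 1s.
    # Rule 1 falls out of this: a scalar has shape (), so it pads to all 1s.
    n = max(len(a), len(b))
    a = (1,) * (n - len(a)) + tuple(a)
    b = (1,) * (n - len(b)) + tuple(b)

    out = []
    # Rule 3: each pair of dimensions must be equal, or one of them must be 1.
    # After padding, the axes are already aligned from the right.
    for x, y in zip(a, b):
        if x != y and x != 1 and y != 1:
            raise ValueError(f"incompatible dimensions {x} and {y}")
        out.append(y if x == 1 else x)
    return tuple(out)

shape = broadcast_shape((2, 3, 1), (1, 4))
print(shape)  # (2, 3, 4)

# Cross-check against NumPy itself.
assert shape == np.broadcast_shapes((2, 3, 1), (1, 4))
assert shape == (np.ones((2, 3, 1)) * np.ones((1, 4))).shape
```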

Once your dimensions are correct, it's a lot easier to reason your way through a problem, similar to how basic dimensional analysis in physics can verify your answer makes some sense.

(I would disable broadcasting if I could, since it has caused way too many silent bugs in my experience. JAX can at least forbid implicit rank promotion via its `jax_numpy_rank_promotion` option, but I don't feel like learning another library to do this.)
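As an illustration of the kind of silent bug meant here, a sketch with made-up data: subtracting a column-shaped array from a flat one raises no error, but quietly produces a full matrix, so a mean squared error computed from it is wrong. The variable names and values are assumptions for the example.

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0])        # shape (3,)
y_pred = np.array([[1.0], [2.0], [3.0]])  # shape (3, 1), e.g. a column of predictions

# (3,) vs (3, 1) broadcasts to (3, 3): every target minus every prediction.
diff = y_true - y_pred
print(diff.shape)  # (3, 3)

# The predictions are perfect, yet this "error" is nonzero.
wrong_mse = np.mean(diff ** 2)

# Dropping the size-1 axis first gives the intended elementwise difference.
right_mse = np.mean((y_true - y_pred[:, 0]) ** 2)  # (3,) vs (3,) -> (3,)

print(wrong_mse > 0, right_mse == 0)  # True True
```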

Once I understood broadcasting, it was a lot easier to practice vectorizing basic algorithms.

3 comments

The broadcasting doc is surprisingly readable and easy to follow. The rules are simple, and the diagrams and examples are excellent. https://numpy.org/doc/stable/user/basics.broadcasting.html

After taking the time to work through that doc and ponder some real-world examples, I went from being very confused by broadcasting to employing intermediate broadcasting techniques in a matter of weeks. Writing out your array dimensions in the same style as their examples (either in a text file or on a notepad) is the key technique IMO:

  Image  (3d array): 256 x 256 x 3
  Scale  (1d array):             3
  Result (3d array): 256 x 256 x 3
And of course with practice you can do it in your head.

Please please don't put that in your head or your notepad. This is what code comments are for!

You definitely should "annotate" arrays with their expected sizes, but if you're doing a lot of broadcasting operations it can get pretty verbose to write out those tables over and over.

That said, yes, you definitely should at least make an attempt to clarify your broadcasting logic if you want to be able to read your own scripts a month from now, let alone write maintainable production code.
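One lightweight way to do that, sketched on the 256 x 256 x 3 image example from earlier in the thread: put the shape in a trailing comment on each array and spell out the broadcast once on the line that does it. The axis names (H, W, C) and the scale values are assumptions for illustration.

```python
import numpy as np

image = np.ones((256, 256, 3))     # (H, W, C) = (256, 256, 3)
scale = np.array([0.5, 1.0, 2.0])  # (C,)      =           (3,)

# (256, 256, 3) * (3,) -> (256, 256, 3): scale aligns with the trailing channel axis.
scaled = image * scale

# A cheap runtime check keeps the comment honest.
assert scaled.shape == (256, 256, 3)
```

This is less verbose than repeating the full table, while still recording which axis each operand is meant to line up with.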

Explicit internal broadcasting (by adding an axis with an explicit `indefinite` size instead of `1`) would be so much simpler to reason about.

Unfortunately there is far too much existing code, and Python is not type-safe.

You can do this with `np.newaxis`. In the NumPy course I wrote as a TA, we required students to always be explicit about the axes (including in e.g. sums). It would be nice if you could disable the implicit broadcasting, but as you mention, that would break so much code.

`np.newaxis` explicitly adds a `1` dimension, not an `indefinite` one.
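For reference, a small sketch of the explicit-axis style under discussion, with made-up arrays. It also shows the point of the reply: the axis that `np.newaxis` adds has size exactly 1, so the stretch to the other operand's size still happens through ordinary implicit broadcasting.

```python
import numpy as np

x = np.arange(3)  # (3,)
y = np.arange(4)  # (4,)

col = x[:, np.newaxis]  # (3, 1): the new axis has size 1, not an "indefinite" size
row = y[np.newaxis, :]  # (1, 4)

# (3, 1) + (1, 4) -> (3, 4); table[i, j] == x[i] + y[j]
table = col + row

# Being explicit about the axis in reductions as well: (3, 4) -> (3,)
row_sums = table.sum(axis=1)

print(col.shape, row.shape, table.shape)  # (3, 1) (1, 4) (3, 4)
print(row_sums)                           # [ 6 10 14]
```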