Hacker News
by rented_mule 21 days ago
numexpr is for different use cases than Polars or Taichi, which themselves are quite different from each other. numexpr is more akin to numba - it speeds up numpy usage.

numexpr speeds up different kinds of usage than numba does. numba is best at speeding up non-vectorizable uses of numpy, like repeated operations on arrays inside for-loops. numexpr speeds up regular numpy expressions, like `5 * x + 7` where x is an array, by avoiding intermediate allocations. It calculates the entire expression for each element, rather than writing the result of each individual operation to an intermediate array. It takes the calculation as a string so that Python cannot break the expression down and hand it off one operator at a time, as happens with numpy.
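To make the `5 * x + 7` case concrete, here is a minimal sketch, assuming numpy and numexpr are installed. The array size and the names `y_numpy` and `y_numexpr` are only for illustration.

```python
import numpy as np
import numexpr as ne

x = np.arange(1_000_000, dtype=np.float64)

# Plain numpy: Python evaluates `5 * x` first, which allocates a temporary
# array, and then `+ 7` allocates a second array for the result.
y_numpy = 5 * x + 7

# numexpr: the whole expression is passed as one string, so it can be
# evaluated in a single pass without the full-size temporary array.
# local_dict is passed explicitly here; by default numexpr looks the
# names up in the calling frame.
y_numexpr = ne.evaluate("5 * x + 7", local_dict={"x": x})

# Both approaches should agree numerically.
print(np.allclose(y_numpy, y_numexpr))
```

Whether this is actually faster depends on array size and hardware, so it is worth timing on your own data.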


I understand what numexpr does; I don't understand why I'd use it.

Polars can lazily evaluate query plans without any unnecessary intermediate allocations. If I want to do algebra on dataframes, I'd use Polars.

The narrow use case seems to be that you have matrices large enough that memory efficiency is a concern, but not so large that they don't fit into memory at all.

My point was that this seems like a very narrow niche to me, one where I'd still rather use numba or Taichi, purely because I don't have to evaluate raw strings and can still rely on linters.

I don't know how you'd define niche, but there are many applications where multidimensional arrays of a uniform data type are needed and data frames are not. Image processing is a simple example, where you might have 2D arrays of floating point brightness values that you want to do operations on. Medical imaging data often has 3 or 4 dimensions (spatial + time). Analysis of a spatially arranged set of sensor readings over time can easily have more than that. The speedup from deferred evaluation is nice in applications like this, especially given how little change to an existing script is needed.
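As a rough illustration of how little changes, here is a sketch of a flat-field correction on a small stack of image frames (time x height x width). The names `frames`, `dark`, and `flat`, the shapes, and the random data are all hypothetical, and it assumes numpy and numexpr are installed.

```python
import numpy as np
import numexpr as ne

rng = np.random.default_rng(0)
shape = (4, 64, 64)  # hypothetical stack: 4 frames of 64x64 pixels

dark = rng.uniform(0.0, 0.1, shape)           # made-up dark-current frames
flat = dark + rng.uniform(0.5, 1.5, shape)    # flat field, always > dark
frames = rng.uniform(0.0, 2.0, shape)         # made-up raw readings

# Existing numpy script: each operator produces a full-size temporary array.
corrected_numpy = (frames - dark) / (flat - dark)

# Adapted version: the same expression, wrapped in a string for numexpr.
corrected_numexpr = ne.evaluate(
    "(frames - dark) / (flat - dark)",
    local_dict={"frames": frames, "dark": dark, "flat": flat},
)

print(np.allclose(corrected_numpy, corrected_numexpr))
```

The arrays keep their shape and dtype; the only edit to the original line is the call to `ne.evaluate` around the quoted expression.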

You're right that it has trade-offs, like challenges with linting. But many practitioners in these domains are experts in the area of science or engineering involved, not in software development. The ease of adapting an existing script is a big deal for many of them. Many don't even know what a linter is, and numexpr predates (by several years) high-quality linters like ruff that so many of us rely on today.