Hacker News
by zahlman 674 days ago
> but I maintain they are still better than gargantuan "frameworks" or everything-but-the-kitchen-sink "util"/"commons" packages, where you end up only using a tiny fraction of the functionality but have to deal with the maintenance cost and attack surface of the whole thing.

Indeed. Several toy projects I've done were blown up in size by four orders of magnitude because of NumPy.

I only want multi-dimensional arrays that support reshaping and basic element-wise arithmetic, maybe matrix multiplication; I'm not even that concerned about performance.

But I have to pay for countless numerical algorithms I've never even heard of, provided by decades-old C and/or FORTRAN projects, plus even more higher-math concepts implemented in Python; NumPy's extensive test suite that I'll never run myself (and it's fragmented - there's even compiled code for testing that's outside of any test folders); a bunch of backwards-compatibility hacks completely irrelevant to my use case; a Python-to-Fortran interface wrapper generator; a vendored copy of distutils even in the wheel; over 3 MiB of .so files for random number generators; a bunch of C header files...

[Edit: ... and if I distribute an application, my users have to pay for all of that, too. They won't use those pieces either; and the likelihood that they can install my application into a venv that already includes NumPy is pretty low.]
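
As a rough illustration of how small the feature set described above is (reshaping, element-wise arithmetic, 2-D matrix multiplication, no concern for speed), here is a minimal pure-Python sketch. `TinyArray` and its methods are hypothetical names made up for this example, not part of NumPy or any existing package, and it deliberately leaves out broadcasting, slicing and dtypes.

```python
from math import prod
from operator import add, mul, sub


class TinyArray:
    """Hypothetical minimal N-d array: a flat row-major list plus a shape tuple."""

    def __init__(self, data, shape):
        data = list(data)
        shape = tuple(shape)
        if prod(shape) != len(data):
            raise ValueError(f"cannot fit {len(data)} elements into shape {shape}")
        self.data, self.shape = data, shape

    def reshape(self, *shape):
        # Element order is unchanged; only the shape metadata differs.
        return TinyArray(self.data, shape)

    def _elementwise(self, other, op):
        if isinstance(other, TinyArray):
            if other.shape != self.shape:
                raise ValueError("shapes must match (no broadcasting in this sketch)")
            return TinyArray(map(op, self.data, other.data), self.shape)
        # Scalar operand: apply it to every element.
        return TinyArray((op(x, other) for x in self.data), self.shape)

    def __add__(self, other):
        return self._elementwise(other, add)

    def __sub__(self, other):
        return self._elementwise(other, sub)

    def __mul__(self, other):
        return self._elementwise(other, mul)

    def __matmul__(self, other):
        if (len(self.shape) != 2 or len(other.shape) != 2
                or self.shape[1] != other.shape[0]):
            raise ValueError("matmul needs 2-D operands with matching inner dimensions")
        n, k = self.shape
        m = other.shape[1]
        # Naive triple loop over the flat row-major storage.
        out = [
            sum(self.data[i * k + t] * other.data[t * m + j] for t in range(k))
            for i in range(n)
            for j in range(m)
        ]
        return TinyArray(out, (n, m))


a = TinyArray(range(6), (2, 3))
b = a.reshape(3, 2)
product = a @ b  # 2x3 times 3x2 gives a 2x2 result
```

Something like this is a few kilobytes of source and uses only the standard library (`math.prod` needs Python 3.8+); the trade-off is that it is far slower than NumPy and has none of its conveniences.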

I know it's fashionable to complain about dependency hell, but modularity really is a good thing. By my estimates, the total bandwidth used daily to download copies of NumPy from PyPI is on par with that used to stream the Baby Shark video from YouTube - assuming it's always viewed in 1080p. (Sources: yt-dlp info for file size; History for the Wikipedia article on most popular YouTube videos; pypistats.org for package download counts; the wheel I downloaded.)
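
The shape of that back-of-the-envelope estimate can be sketched as below. The function names are hypothetical, and the numbers are placeholders chosen only to make the arithmetic easy to follow; they are not the figures behind the comparison, which would come from the sources listed above.

```python
MIB = 1024 * 1024


def daily_bytes(events_per_day, bytes_per_event):
    """Total bytes transferred per day for one kind of download or stream."""
    return events_per_day * bytes_per_event


def bandwidth_ratio(downloads_per_day, wheel_bytes, views_per_day, video_bytes):
    """Ratio of daily package-download traffic to daily video-streaming traffic."""
    return (daily_bytes(downloads_per_day, wheel_bytes)
            / daily_bytes(views_per_day, video_bytes))


# Placeholder inputs for illustration only, NOT real measurements.
ratio = bandwidth_ratio(
    downloads_per_day=1_000,  # would come from pypistats.org
    wheel_bytes=20 * MIB,     # would come from the size of the downloaded wheel
    views_per_day=500,        # would come from view counts in the Wikipedia history
    video_bytes=40 * MIB,     # would come from yt-dlp's reported 1080p file size
)
print(f"package traffic is {ratio:.2f}x the video traffic")
```

A ratio near 1 is what "on par" means here. The estimate assumes every download transfers the full wheel and every view streams the full 1080p file, so caching and partial views would shift it either way.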