by enriquto 1877 days ago
> do sparse matrix computations on massive matrices

This is completely impossible to do in the Python language, unless you resort to external tooling written in C or Fortran. Sure, you can call these codes from Python, as you can call them from any other language.


Nobody cares about what you're saying though (with respect) in this area. It's all about the ecosystem or Anaconda distribution itself rather than just the core language. I agree that what you're saying is accurate, but it also happens to be irrelevant in this particular case.

Numerical methods and data science are mostly done by engineers, mathematicians, and other random STEM folks. I've yet to meet someone who is even cognizant that Numpy is really calling out to some low level C, C++, or Fortran library. They just know that you call a library like any other and the code works.

If you're trying to say that any language with FFI capabilities can do that, you'd be right, but it also doesn't matter much. Python has somehow found a sweet spot where it's easy to learn and onboard people and there is support for a lot of stuff with relatively low hassle. It certainly isn't Lisp, but somehow seems to be orders of magnitude more successful.

I've been searching for a tool/language/ecosystem to replace Python for ages, but nothing ends up coming close. I spent a significant amount of time learning Lisp, but a lot of what I saw (besides the power of macros and restarts) was just a less intuitive way of doing things I could easily do in Python, Ruby, or Perl. Lisp is secret alien technology if you're coming from C or C++, but coming from Python it seems closer to a wash.

> I've yet to meet someone who is even cognizant that Numpy is really calling out to some low level C

Then you've never met anybody who builds the tools that you use. Which is alright. But if you disparage their point of view then you sound a bit funny.

In this particular niche, yes. But again, that is almost entirely irrelevant to the vast majority of the millions of scientific Python users.

I'd love to write my own solution in assembly or C where I give birth to every function, but nobody has time for that level of monumental effort. Low level matrix libraries have a lot of inertia for a reason.

I'm not disparaging anybody's point of view. Yours is certainly valid for a small group of elite users. I'm just trying to point out that it is only a valid point for a very small group. Most simply view these things from the perspective of the entire ecosystem. Even scientists well aware of the C internals will not always use that knowledge.

The ecosystem matters. I'm a developer and not a scientist, but having everything inside an environment that's at least workable is a huge boon.

Of course you could call the same functions from the FFI of any other language, but nobody does that for the same reason that nobody writes web applications in C.
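For anyone who hasn't seen what "calling out through the FFI" looks like from the Python side, here is a minimal sketch using the standard-library `ctypes` module. It assumes a POSIX system where the C library can be located or is already loaded in the process; the wrapper name `c_strlen` is made up for illustration and is not something from the thread.

```python
import ctypes
import ctypes.util

# Assumption: POSIX. find_library may return None, in which case
# CDLL(None) exposes the symbols already loaded into this process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Hypothetical wrapper: the length is computed by C code, not Python."""
    return libc.strlen(data)


length = c_strlen(b"sparse")
```

Numerical libraries wrap far more elaborate routines than `strlen`, but the shape is the same: Python declares the types, and the compiled code does the work.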

I hate Python. As far as I'm concerned, it's a nightmare hell of a language that does everything wrong, and yet it's probably the language I use the most due to its sheer convenience and massive ecosystem.

This is really splitting hairs, isn't it? Plenty of languages are not bootstrapped. Isn't that essentially the same thing?
No, nothing to do with bootstrapping, this is completely different. My point is that you cannot develop the very algorithms that you are using. Numerical math is not only about using ready-made algorithms, it is mostly about implementing new algorithms. For example, if you invent a new matrix factorization algorithm, it is very likely that you cannot implement it in Python (or if you can, it will be either very slow or very cumbersome). Python+numpy is not a natural way to write many numerical algorithms, which are based on explicit loops with new conditions inside them. Whereas in Fortran or in C, the implementation is likely to be much simpler, more natural, and faster.
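To make the "explicit loops with conditions inside them" point concrete, here is a rough sketch of one such kernel in plain Python. Gauss-Seidel iteration is used purely as a stand-in (it is not mentioned in the thread), and the function name, tolerance, and test system are made up for illustration. Each update of `x[i]` reads entries of `x` that were already updated earlier in the same sweep, which is why it does not collapse into a single whole-array NumPy expression.

```python
def gauss_seidel(a, b, sweeps=50, tol=1e-12):
    """Solve a x = b for a diagonally dominant matrix `a` (list of rows).

    Hypothetical helper for illustration; pure Python, no NumPy.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(sweeps):
        largest_change = 0.0
        for i in range(n):
            # Off-diagonal sum; x[j] for j < i was already updated this sweep.
            s = 0.0
            for j in range(n):
                if j != i:
                    s += a[i][j] * x[j]
            new_xi = (b[i] - s) / a[i][i]
            largest_change = max(largest_change, abs(new_xi - x[i]))
            x[i] = new_xi
        if largest_change < tol:
            break  # converged
    return x


# Small diagonally dominant system whose exact solution is [1, 1, 1].
a = [[4.0, 1.0, 0.0],
     [1.0, 4.0, 1.0],
     [0.0, 1.0, 4.0]]
b = [5.0, 6.0, 5.0]
x = gauss_seidel(a, b)
```

The same triple loop transcribes almost line for line into C or Fortran, where it compiles to tight machine code; in CPython every iteration goes through the interpreter.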
Nobody is arguing that, but they're saying it doesn't matter to the majority of scientists who just want to invert a matrix for some study and don't need to implement a new matrix inversion algorithm. I would use C or C++ for that most likely. That is a valid use case for some scientists, but I would expect it to be a very small number compared to those that just need to use the existing tools in the ecosystem.

I think we may be speaking past each other a bit.

Every single bit of Python relies on code written in C. What is it that you are trying to refute here?
> Every single bit of Python relies on code written in C.

There's PyPy, a JIT Python interpreter written entirely in Python, and it does not depend on C. It is also much faster than the common interpreter, CPython. Unfortunately it is still not appropriate for numerical computation, as the language itself makes working directly with numbers very cumbersome (and this was the point I wanted to make).