by kwertzzz 2983 days ago
I am teaching a class on numerical methods, and one student chose to use Python (while my example code was in Julia). It was painful to watch this student constantly shoot himself in the foot because of some particular behaviors of Python and NumPy. For instance, the student did not expect that a list comprehension iterating over a NumPy vector returns just a Python list. Also, the fact that the last value is excluded from index ranges led to several bugs. The exercise involved a time-dependent matrix, and the student chose to represent it as a 3D array, but then he constantly needed to convert slices to a matrix to use matrix multiplication (maybe this is now better solved with Python 3 and the @ operator). So in short, for doing linear algebra, Julia is really more convenient to use.
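
For readers who have not hit these two pitfalls, here is a minimal sketch, assuming NumPy is installed. The variable names are made up for illustration and are not taken from the student's code.

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])

# A list comprehension over a NumPy array yields a plain Python list,
# so array arithmetic no longer applies to the result.
squares_list = [x**2 for x in v]

# The vectorized expression keeps the result as an ndarray.
squares_arr = v**2

# Slices and ranges exclude the end index: v[0:2] holds two elements.
first_two = v[0:2]
indices = list(range(3))  # 0, 1, 2; the value 3 is not included
```

Wrapping the comprehension in np.array(...) converts the list back to an array when a loop-style expression is really wanted.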

This comment just sounds like "I know tool X well, somebody else doesn't know tool Y well, and therefore tool X is better."

To do matrix multiplication on many matrices "stacked" together in one step, use numpy.matmul: https://docs.scipy.org/doc/numpy/reference/generated/numpy.m... (and so there's no need to slice up the array, convert to matrix, etc.)
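
A small sketch of what the stacked multiplication could look like, assuming the time-dependent matrix is stored with time as the first axis. The shapes and names here are hypothetical.

```python
import numpy as np

# Hypothetical data: 4 time steps, each with a 3x3 matrix and a column vector.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3, 3))
x = rng.standard_normal((4, 3, 1))

# matmul treats the leading axis as a stack and multiplies the
# trailing 2-D matrices pairwise, one product per time step.
y = np.matmul(A, x)

# The @ operator (Python 3.5+) does the same thing.
y_op = A @ x

# Explicit per-slice loop, shown only for comparison.
y_loop = np.stack([A[t] @ x[t] for t in range(A.shape[0])])
```

All three results have shape (4, 3, 1), and no slice has to be converted to a matrix object.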

Note that the Numpy devs are trying to (if they haven't already) get rid of the "matrix" class and just use arrays, but of course dealing with legacy code is always an issue. Once that's out of the way, people won't be distracted by "matrices" to do matrix operations, and hopefully they'll see you can do matrix operations on arrays directly. (And yes, in Python 3 you can use the @ syntax to the same effect.)
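
To make the array-only approach concrete, here is a minimal sketch with plain ndarrays (example values chosen arbitrarily). On arrays, * is elementwise and @ is the matrix product, and @ chains without nested function calls.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])

# On plain arrays, * multiplies element by element.
elementwise = A * B

# @ is the matrix product.
product = A @ B

# Chained products: operator form versus nested function calls.
chained_op = A @ B @ A
chained_fn = np.matmul(np.matmul(A, B), A)
```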

Thank you for the info about numpy.matmul. However, it feels quite clumsy to use a function call for a matrix operator, especially when you have several chained. I think it is a good move to get rid of the matrix class (I just didn't find any deprecation info in the documentation; maybe this still needs some time).

I think the Python language is very elegant, and the principle that there should be only one obvious way to do something has served it quite well. Unfortunately, in NumPy you quite often have multiple ways to do things (often in addition to Python's own mechanisms, e.g. sum, numpy.sum and the sum method). I deal with students who have little programming experience, and this can be confusing. One of the reasons I chose Julia for my lectures was that these issues do not exist in Julia. Julia is quite clean and simple in this regard.
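
As an illustration of the three ways to sum, here is a minimal sketch on a small 2-D array (values chosen arbitrarily). The built-in sum gives a different answer than the other two.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

# numpy.sum and the ndarray method both reduce over all elements by default.
total_np = np.sum(a)
total_method = a.sum()

# The built-in sum iterates over the first axis and adds the rows,
# so it returns an array of column sums rather than a scalar.
builtin = sum(a)

# The same column sums, requested explicitly.
col_sums = a.sum(axis=0)
```

Here total_np and total_method are both 10, while builtin is the array [4, 6].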

However I completely see that for an experienced programmer (or a scientist with good programming experience), this is not a problem.

But for somebody learning to solve numerical problems, it is quite helpful that Julia code tends to be closer to the mathematical formulation. In addition, in the tests I ran, Julia code tends to be faster than vectorized NumPy code (I can share the code if there is interest). The only major argument against Julia, in my opinion, is that it is still a young language with a small ecosystem (much smaller, in fact, than Python's or R's).

I would be curious about the code you use. Numpy was natural for me after going through engineering school, where Matlab was taught from year 2 on. Again, that was a language much more focused on the numerics. But as soon as I had to do something that wasn't numerical (first job out of school, and for everything since), I learned to hate Matlab and love Python.

Anyhow, that experience surely doesn't map onto Julia, a completely different language. So I'd be curious to see what your use case is; it might give me a different perspective on Julia (which I have only played with a couple of times back when it was even younger).

Sorry for the delay, but here is some example code:

https://gist.github.com/Alexander-Barth/c8eb764f400cdb7a1eb5...

Do not hesitate to tell me if I missed a way to optimize the Python code. If somebody has Numba, Pythran, ... installed, I would be interested to see the speed-up compared to the vanilla Python version on your machine.

So in short, for my cases: the fastest Julia test case (with loops and avoiding unnecessary allocation) was about 10x faster than the fastest Python 3 test case (with vectorization).

The runtimes with vectorization are relatively similar (Julia is only about 25% faster than Python). Explicit loops and careful memory management are clearly beneficial in Julia.

The speed difference compounds as code grows in complexity. Julia compiles things together, using interprocedural optimizations and automatic type specialization. Lots of allocation-saving utilities, fancy performance macros, etc. let you get very fast code. Recent testing in large applications showed that using a Numba function from Julia code or Python code was about 10x slower than Julia code (here's a quick writeup: http://juliadiffeq.org/2018/04/30/Jupyter.html). We had a crew that was more experienced in Python keep trying to make it better (and lots of Julia programmers come from Python and have many more years of experience with Python). 10x seemed to be how far Pythran, Cython, and Numba were behind defining a Julia function through pyjulia (Numba was great and the easiest in comparison, so our docs kept that and dumped the others).

The moral of the story is that these Python tools are built for microbenchmarks and can do okay there, but without the full stack optimized together and without a type system that's exploitable for all of the performance tricks, they fall apart in real-world code.