by MichaelSalib 4673 days ago
Um, people have been doing high performance numeric and scientific computing in Python for over 15 years now. The truth is, for a lot of numeric computing, you're better off using Python, which calls Fortran or C code to do the heavy lifting.

Fortran can run very fast. If all you need to do is numerical calculations, that might be enough. But that's never all you need to do: you need to preprocess your input data, or deserialize it, normalize it, join it or transform it based on some RDBMS data, then you do your calculations, and then you need to graph it, serialize it, etc. Most of those tasks are somewhere between excruciating and impossible in Fortran, however modern a dialect you use.
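To make that pipeline concrete, here is a rough sketch using only the Python standard library. Everything in it is made up for illustration: the `station` and `reading` tables, the `scale` column, the sample CSV, and the `run_pipeline` name are all hypothetical, and the "calculation" is just a per-station mean standing in for whatever the real numeric kernel would be.

```python
import csv
import io
import json
import sqlite3

# Hypothetical raw input: one reading per line, keyed by station id.
RAW_CSV = "station_id,reading\n1,10.0\n2,20.0\n1,14.0\n"


def run_pipeline(raw_csv):
    # Deserialize: parse the CSV text into typed tuples.
    rows = [
        (int(r["station_id"]), float(r["reading"]))
        for r in csv.DictReader(io.StringIO(raw_csv))
    ]

    # Hypothetical relational metadata, held in an in-memory SQLite database.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE station (id INTEGER PRIMARY KEY, name TEXT, scale REAL)")
    db.executemany(
        "INSERT INTO station VALUES (?, ?, ?)",
        [(1, "north", 1.0), (2, "south", 0.5)],
    )
    db.execute("CREATE TABLE reading (station_id INTEGER, value REAL)")
    db.executemany("INSERT INTO reading VALUES (?, ?)", rows)

    # Join and normalize: attach the station name and apply its scale factor.
    joined = db.execute(
        "SELECT s.name, r.value * s.scale "
        "FROM reading r JOIN station s ON s.id = r.station_id"
    ).fetchall()
    db.close()

    # Calculate: a per-station mean stands in for the real number crunching.
    groups = {}
    for name, value in joined:
        groups.setdefault(name, []).append(value)
    means = {name: sum(vals) / len(vals) for name, vals in groups.items()}

    # Serialize: emit JSON for whatever consumes the result next.
    return json.dumps(means, sort_keys=True)


print(run_pipeline(RAW_CSV))
```

None of the glue steps here is hard, which is the point: each is a few lines around the actual computation.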

3 comments

>Um, people have been doing high performance numeric and scientific computing for over 15 years now.

Have I disputed this fact somewhere?

>The truth is, for a lot of numeric computing, you're better off using Python, which calls Fortran or C code to do the heavy lifting.

Proof that Python can't get the job done in that particular area. Nevertheless there remain Python zealots who will tell you that it's the best for everything.

>Fortran can run very fast. If all you need to do is numerical calculations, that might be enough.

Actually it sounds like that is all he needs to do. Reading and spitting out a .csv file from Fortran is trivial, despite its clunky IO syntax.

>Have I disputed this fact somewhere?

Yes, when you were complaining that "These are the guys trying to cram Python into every possible goddamned use case".

>Proof that Python can't get the job done in that particular area.

Python can't run at all without C code. Fortunately, Python interfaces really well with C, C++, and Fortran. The fact that Python lets you be way more productive using Fortran tools without having to know or suffer from Fortran's glaring deficiencies seems like a huge plus to me.
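As a small illustration of that interfacing, here is a minimal sketch that calls the C math library's `cos` from Python through the standard `ctypes` module. It assumes a POSIX-like system where the math library can be located (or where its symbols are already loaded into the process); `load_c_cos` is a hypothetical helper name, and real numeric code would more likely go through a wrapper layer than raw `ctypes`.

```python
import ctypes
import ctypes.util
import math


def load_c_cos():
    """Return the C library's cos() as a Python callable (POSIX-like systems assumed)."""
    path = ctypes.util.find_library("m")
    # Fall back to the symbols already loaded in the current process
    # if no separate math library is found.
    lib = ctypes.CDLL(path) if path else ctypes.CDLL(None)
    # Declare the C signature: double cos(double).
    lib.cos.argtypes = [ctypes.c_double]
    lib.cos.restype = ctypes.c_double
    return lib.cos


c_cos = load_c_cos()

# Compare the C result with Python's own math.cos for a few inputs.
for x in (0.0, 1.0, math.pi):
    print(x, c_cos(x), math.cos(x))
```

The Python side only declares the signature and passes values through; the compiled function does the work.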

>Reading and spitting out a .csv file from Fortran is trivial, despite its clunky IO syntax.

Not really. People who do serious analysis need to make graphs. They need to push data back into SQL databases. They need to do all sorts of interactive analyses. They need to present their results and analytics to other people, including reviewers. They need to debug their analysis. All of that stuff is much easier with Python than Fortran.

>People who do serious analysis need to make graphs

Which is where the aforementioned CSV files come in handy. Heard of gnuplot?

I used to work for a physicist who (still) codes in Fortran. I tried to show him Python. He said, "That's nice, use Python if that's what you like." But he always used Fortran. It wasn't usually a big deal to translate between the two. He was giving out the big ideas, we were making use of them. To this day, he is still doing physics, still using Fortran, still publishing papers.

> you need to preprocess your input data, or deserialize it, normalize it, join it or transform it based on some RDBMS data, then you do your calculations, and then you need to graph it, serialize it, etc.

In large-scale number crunching, such as climate models and numerical weather prediction, the typical case is that the input data is conceptually a regular 2D or 3D grid, stored in binary format files (like NetCDF or HDF), as that is more efficient and saves space.

Then the heavy-lifting number-crunching code runs on a cluster as a batch job, reads in the data, crunches the numbers, and writes the results out again as NetCDF or HDF files.

The output files are then downloaded to a desktop PC, and graphing is done with MATLAB (Python is also getting more popular) or, especially in meteorology, with dedicated meteorology graphing software.

The binary format input and output is probably about as efficient as it can be. Also, scientists doing heavy number crunching probably don't have much use for relational databases.
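To illustrate the binary versus text point, here is a minimal sketch using Python's standard `array` module. It is not NetCDF or HDF (those need third-party libraries and carry metadata that this ignores); it only shows a flat row-major grid of doubles written as raw bytes and as text. The grid contents and all function names are hypothetical.

```python
import array
import math


def make_grid(nx, ny):
    # Flat, row-major storage of an nx-by-ny regular grid of C doubles.
    return array.array("d", (math.sin(i * ny + j) for i in range(nx) for j in range(ny)))


def to_binary(grid):
    # Raw machine representation: a fixed number of bytes per value, no parsing needed.
    return grid.tobytes()


def from_binary(data):
    # Rebuild the grid directly from the raw bytes.
    out = array.array("d")
    out.frombytes(data)
    return out


def to_text(grid):
    # One decimal value per line, the way a CSV-style dump would store it.
    return "\n".join(repr(v) for v in grid).encode("ascii")


grid = make_grid(10, 10)
binary = to_binary(grid)
text = to_text(grid)

print("binary bytes:", len(binary))
print("text bytes:  ", len(text))
print("round trip ok:", from_binary(binary) == grid)
```

For full-precision values the text form is typically noticeably larger and has to be parsed on the way back in, while the binary form is read as-is.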