Hacker News
by Hankenstein2 1918 days ago
I work at one of the labs mentioned and get paid for running not only the climate models but mesoscale models as well, which are also written in Fortran.

The premise of the article is that Fortran, 70 years later, is still an appropriate tool for crunching numbers, which it absolutely is, but it neglects one major problem.

Like the COBOL issue that was all the rage 20 years ago, it is difficult to hire younger generation programmers that want to and are excited to develop in Fortran.

4 comments

> I work at one of the labs...

> ...it is difficult to hire younger generation programmers that want to and are excited to develop in Fortran.

How much are you paying? More often than not when I see this kind of reasoning, digging deeper shows that the salaries are not competitive. There's a large number of us who just want to work on interesting problems for adequate money and don't care what the toolset is. I'm fully on board with the idea of being paid to write Fortran.

Also, COBOL's problem isn't so much that younger generations aren't excited about it, but that the problems in the domain solved by COBOL all require highly specialized domain knowledge about an obtuse set of systems said code runs on (with most of their documentation paywalled, at least until recently). The barriers to entry are much, much higher and few companies are willing to train at the rates the language demands.

Let's just say labs pay way better than universities.
For sure, but I mean compared to industry.
My understanding is that they're mostly Fortran programs linked together with Unix scripts and run on HPC systems. Could the models run in a more distributed way, like a high-quality grid computing setup? Lastly, what's the best way to find and learn more about the models?
Switching to any sort of commercial grid or cloud computing setup would be rather complicated by the fact that climate models are critically dependent on the fast, low-latency interconnects (e.g., InfiniBand) of a proper HPC system to achieve good performance at scale. This is usually coordinated with hand-written message passing via MPI directly in the relevant top-level Fortran (or C/C++) program.

There are some other (i.e., “embarrassingly parallel”) scientific computing problems where a higher-latency distributed setup would be fine, but in climate models, as in any finite-element model, each grid cell needs to be able to “talk to” its neighbors at each timestep, leading to quite a lot of inter-process communication.
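To make the "talk to its neighbors" point concrete, here is a minimal sketch in plain Python, not taken from any real climate model. It splits a toy 1-D diffusion grid across simulated ranks and copies edge cells into neighbouring "halo" cells before every timestep. All function names are hypothetical, and `exchange_halos` only stands in for what would be MPI send/receive pairs in a real Fortran or C code; no actual MPI is used here.

```python
# Toy 1-D diffusion stencil, serial vs. split across simulated "ranks".
# Assumption: len(u) is divisible by nranks. No real MPI is involved.

def step_serial(u, alpha=0.1):
    """One explicit diffusion step on the whole grid; end cells stay fixed."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = u[i] + alpha * (u[i - 1] - 2 * u[i] + u[i + 1])
    return new

def run_serial(u, steps, alpha=0.1):
    for _ in range(steps):
        u = step_serial(u, alpha)
    return u

def split(u, nranks):
    """Contiguous chunks, each padded with one halo cell on either side."""
    n = len(u) // nranks
    return [[0.0] + u[r * n:(r + 1) * n] + [0.0] for r in range(nranks)]

def exchange_halos(chunks):
    """Stand-in for MPI communication: fill halos from neighbours' edge cells."""
    for r, c in enumerate(chunks):
        if r > 0:
            c[0] = chunks[r - 1][-2]    # left halo <- left neighbour's last real cell
        if r < len(chunks) - 1:
            c[-1] = chunks[r + 1][1]    # right halo <- right neighbour's first real cell

def step_chunk(c, alpha, is_first, is_last):
    """Update the real cells of one chunk; global boundary cells stay fixed."""
    new = c[:]
    lo = 2 if is_first else 1
    hi = len(c) - 2 if is_last else len(c) - 1
    for i in range(lo, hi):
        new[i] = c[i] + alpha * (c[i - 1] - 2 * c[i] + c[i + 1])
    return new

def run_distributed(u, nranks, steps, alpha=0.1):
    chunks = split(u, nranks)
    for _ in range(steps):
        exchange_halos(chunks)          # one round of communication per timestep
        chunks = [step_chunk(c, alpha, r == 0, r == nranks - 1)
                  for r, c in enumerate(chunks)]
    return [x for c in chunks for x in c[1:-1]]   # drop halos, stitch back together
```

The decomposed run reproduces the serial result only because the halos are refreshed on every step. On a real machine each of those refreshes is a network round trip, which is why interconnect latency matters so much.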

Yes, they run in the cloud, see e.g. https://cloudrun.co (disclaimer: my side-business), but others have done it as well, for a few years now. On dedicated, shared-memory nodes, it's no different from HPC performance-wise. It can be even better because cloud instances tend to have later-generation CPUs, whereas large HPC systems are typically updated every ~5 years or so. But for distributed-memory parallel runs (multi-node), latency increases considerably on commodity clouds, which kills parallel scaling for models. Fortunately, major providers (AWS, GCP, Azure) have recently started offering low-latency interconnects for some of their VMs, so this problem will soon go away as well.
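As a rough illustration of why latency hurts multi-node scaling, here is a back-of-the-envelope cost model in Python. It is a sketch under loud assumptions: the per-step compute time, the number of messages per step, and both latency figures are made-up illustrative values, and the model ignores bandwidth, load imbalance, and any overlap of communication with computation.

```python
def step_time(ranks, latency_s, work_s=0.01, msgs_per_step=4):
    """Seconds per timestep: compute shrinks with rank count, message latency does not."""
    comm = msgs_per_step * latency_s if ranks > 1 else 0.0
    return work_s / ranks + comm

def speedup(ranks, latency_s, work_s=0.01, msgs_per_step=4):
    """Speedup over a single rank, which needs no communication at all."""
    return work_s / step_time(ranks, latency_s, work_s, msgs_per_step)

# Illustrative (assumed) latencies: a few microseconds for an HPC-class
# interconnect, around a hundred microseconds for a commodity network.
LOW_LATENCY_S = 2e-6
HIGH_LATENCY_S = 100e-6

if __name__ == "__main__":
    for ranks in (16, 128, 1024):
        print(ranks,
              round(speedup(ranks, LOW_LATENCY_S), 1),
              round(speedup(ranks, HIGH_LATENCY_S), 1))
```

In this toy model the fixed per-step latency term starts to dominate once the per-rank compute time drops below it, so the higher-latency curve flattens out long before the lower-latency one does.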
Indeed, basically, though you may lose from lack of direct access to the hardware. But it's typically expensive. Do AWS and GCP actually have RDMA fabrics now? The AWS "low latency" one of a year or so ago had a similar latency to what I got with 1GbE at one time.
Difficult to run true HPC software like this as a 'grid'. High speed, low latency communication (with MPI) is required.
I was part of a project looking at the feasibility of migrating some of the EPA's air pollutant exposure models from Fortran to R/Python. While Fortran was decisively faster, I think the project lead recommended migrating the model to R since not many people used Fortran anymore. It was also harder to share Fortran code for collaboration.
I wrote an add-on for the MBS application Simpack[1] in Fortran as part of my master's thesis, and I have to say that, except for the stupid line length limit, I enjoyed using Fortran (it was my first contact with Fortran). My educational background is mechatronics, so my CS background is not web/GUI applications, but rather embedded systems.

[1] www.simpack.com