by marcosdumay 1479 days ago
> Given all the talk about OpenMP compatibility and Fortran... my guess is that they're largely running legacy code in Fortran.

The most-used linear algebra library is written in Fortran. There's nothing "legacy" about it; it's just that nobody has been able to replicate its speed in C.

5 comments

I don't remember the exact specifics, but Fortran disallows some of the constructs that C/C++ struggle with aliasing on, so Fortran can often be (safely) optimized to much higher-performance code because of this limitation/knowledge.

Like, it's always seemed like there's a certain amount of fatalism around Undefined Behavior in C/C++, like this is somehow how it has to be to write fast code but... it's not. You can just declare things as actually forbidden rather than just letting the compiler identify a boo-boo and silently do whatever the hell it wants.

Of course it's not the right tool for every task; I don't think you'd write bit-twiddling microcontroller stuff in Fortran, or systems programming. But for the HPC space, and other "scientific" code? Fortran is a good match and very popular despite having an ancient legacy even by C/C++ standards (both have, of course, been updated through time). It's a little less flexible/general, but that allows less-skilled programmers (scientists are not good programmers) to write fast code without arcane knowledge of the gotchas of C/C++ compiler magic.

> I don't remember the exact specifics, but Fortran disallows some of the constructs that C/C++ struggle with aliasing on, so Fortran can often be (safely) optimized to much higher-performance code because of this limitation/knowledge.

For a crude approximation, Fortran is somewhat equivalent to C code where all pointer function arguments are marked with the restrict keyword.
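To make that comparison concrete, here is a minimal C sketch of the same loop written with and without `restrict`. The function names are made up for illustration, and how much a given compiler actually gains from the qualifier is an assumption that depends on the compiler and flags.

```c
#include <assert.h>
#include <stddef.h>

/* Plain pointers: the compiler must allow for x and y overlapping,
   so a store to y[i] might change a later x[j]. That can block
   vectorization or force extra runtime overlap checks. */
void axpy_may_alias(size_t n, double a, const double *x, double *y)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}

/* restrict (C99): the caller promises that x and y do not overlap.
   The compiler may optimize on that promise; it does not verify it.
   Calling this with overlapping arrays is undefined behavior. */
void axpy_no_alias(size_t n, double a,
                   const double *restrict x, double *restrict y)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}
```

Both versions compute the same result when the arrays are distinct; the difference is only in what the compiler is allowed to assume. Comparing the generated assembly at `-O2` or `-O3` is the usual way to see whether it matters for a particular loop.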

> Like, it's always seemed like there's a certain amount of fatalism around Undefined Behavior in C/C++, like this is somehow how it has to be to write fast code but... it's not. You can just declare things as actually forbidden rather than just letting the compiler identify a boo-boo and silently do whatever the hell it wants.

Well, it's kind of more dangerous than C in this respect. The aliasing restriction is a restriction on the Fortran programmer; the compiler or runtime is not required to diagnose violations, so the Fortran compiler is allowed to optimize on the assumption that two pointers don't alias, and code that breaks the rule can silently produce wrong results.

That being said, in general I'd say Fortran has fewer footguns than C or C++, and is thus often a better choice for a domain expert who just wants to crunch numbers.

> The most-used linear algebra library is written in Fortran.

My understanding is that most supercomputers have the vendor provide their implementation of BLAS (e.g., if it's Intel-based, you're getting MKL) that's specifically tuned for that hardware. And these implementations stand a decent chance of being written in assembly, not Fortran.

Usually C or Fortran superstructure, and assembly kernels.

The clearest form of this is in BLIS, which is a C framework you can drop your assembly kernel into, and then it makes a BLAS (along with some other stuff) for you. But the idea is also present in OpenBLAS.
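A rough sketch of that "C superstructure, pluggable kernel" split, in plain C. This is not the BLIS API: `gemm_blocked`, `microkernel_fn`, `microkernel_ref`, and the block sizes `MR`/`NR` are hypothetical names picked for illustration, and the kernel here is portable C standing in for what would be hand-written assembly. Real frameworks also pack panels and block for cache, which is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register-block sizes; a real kernel picks these per CPU. */
#define MR ((size_t)2)
#define NR ((size_t)2)

/* Micro-kernel contract: C[MR x NR] += A[MR x k] * B[k x NR].
   All matrices are row-major; lda/ldb/ldc are row strides. */
typedef void (*microkernel_fn)(size_t k,
                               const double *a, size_t lda,
                               const double *b, size_t ldb,
                               double *c, size_t ldc);

/* Reference kernel in portable C. This is the piece that a tuned
   library would replace with an assembly routine. */
void microkernel_ref(size_t k,
                     const double *a, size_t lda,
                     const double *b, size_t ldb,
                     double *c, size_t ldc)
{
    for (size_t p = 0; p < k; p++)
        for (size_t i = 0; i < MR; i++)
            for (size_t j = 0; j < NR; j++)
                c[i * ldc + j] += a[i * lda + p] * b[p * ldb + j];
}

/* Driver ("superstructure"): C (m x n) += A (m x k) * B (k x n),
   all contiguous row-major. It walks C in MR x NR tiles and hands
   each tile to whichever kernel was passed in. */
void gemm_blocked(size_t m, size_t n, size_t k,
                  const double *a, const double *b, double *c,
                  microkernel_fn kernel)
{
    /* Simplification: no edge handling for partial tiles. */
    assert(m % MR == 0 && n % NR == 0);
    for (size_t i = 0; i < m; i += MR)
        for (size_t j = 0; j < n; j += NR)
            kernel(k, &a[i * k], k, &b[j], n, &c[i * n + j], n);
}
```

Calling `gemm_blocked(m, n, k, a, b, c, microkernel_ref)` runs the portable path; swapping in a different function with the same signature changes the inner loop without touching the driver.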

Lots of this is due to the legacy of GotoBLAS (which was forked into OpenBLAS, and partially inspired BLIS), written by the somewhat famous (in HPC circles at least) Kazushige Goto. He works at Intel now, so probably they are doing something similar.

BLAS itself has been rewritten in Nvidia CUDA and AMD HIP, and is likely the workhorse in this case. (Remember that Frontier is mostly GPUs and the bulk of the code should be GPU-compatible.)

Presumably that old Fortran code has survived many generations of ports: Connection Machine, DEC Alpha, Intel Itanium, SPARC and finally today's GPU-heavy systems. The BLAS layer keeps getting rewritten but otherwise the bulk of the simulators still works.

I think you've made a slightly bigger claim than is necessary, which has led to a focus on BLAS, which misses the point.

The best BLAS libraries use C and assembly. This is because BLAS is the de facto standard interface for linear algebra code, and so it is worthwhile to optimize it to an extreme degree (given infinite programmer-hours, C can beat any language, because you can embed assembly in C).

But for those numerical codes which aren't incredibly hand-optimized, Fortran makes nice assumptions, so it should be able to optimize the output of a moderately skilled programmer pretty well (hey, we aren't all experts, right?).

If you are talking about Netlib BLAS/LAPACK, I am very confused by what you are saying, because the fastest BLAS/LAPACK implementations are in C/C++.