Hacker News new | ask | show | jobs
by ruste 1025 days ago
This is probably fantastic from a maintainability perspective, but I'm curious if some performance is left on the table by using LLVM IR instead of compiling directly to machine code. I know there are a number of optimizations that can be made for Fortran that can't be made for C-like languages and I wonder if some of those C-like assumptions are implicitly encoded in the IR.
5 comments

The original author of LFortran. Great question.

We designed LFortran to first "raise" the AST (Abstract Syntax Tree) to ASR (Abstract Semantic Representation). The ASR keeps all the semantics of the original code, but it is otherwise as abstract/simple as possible. Thus by definition it allows us to do any optimization possible, as ASR->ASR optimization pass. We do some already, we will do many more in the future. This optimizes all the things where you need to know the high level information about Fortran. Then once we can't do any more optimizations, we lower to LLVM. If in the future it turns out we need some representation between ASR and LLVM, such as MLIR, we can add it.

We also have a direct ASR->WASM and WASM->x64 machine code, and even direct ASR->machine code, but the ASR->LLVM backend is the most advanced, after that probably our ASR->C backend and after that our ASR->WASM backend.

How does LLVM cope with the array semantics? I was under the impression that the noalias attribute in the IR was not activated in such a way as to enable the optimizations that make Fortran so fast.
My experience with LLVM so far has been that is possible to get maximum speed as long as we generate the correct and clean LLVM IR, and do many of the high level optimizations ourselves.

If LLVM has any downsides, it is that it is hard to run in the browser, so we don't use it for https://dev.lfortran.org/, and that it is slow to compile (both LLVM itself, as well as it makes LFortran slow to compile, compared to our direct WASM/x64 backends). But when it comes to runtime performance of the generated code, LLVM seems very good.

LLVM is awesome if you emit what it expects (basically IR as close to what clang emits).

The further you stray from that, the worse is it.

Rust drove the fixes needed in llvm to support noalias. They went through a couple reverts before seemingly fixing everything. If lfortran emits noalias, llvm can probably handle it now.
> We designed LFortran to first "raise" the AST (Abstract Syntax Tree) to ASR (Abstract Semantic Representation) …

I’ve actually never heard of an ASR before… sounds very interesting! Do you happen to have any resources I can read about this?

WASM -> x64 as in a full WASM AOT compiler? Don't those already exist? What's the benefit to making one specifically for LFortran? Unless I'm misunderstanding

Super cool stuff though

Yes, we could make the WASM->x64 standalone. The main motivation is speed of compilation. We do not do any optimizations, but we want to generate the x64 binary as quickly as possible, with the idea that it would be used in Debug mode, for development. Then for Release mode you would use LLVM, which is slow to compile, but good runtime performance. And since we already have ASR->WASM backend (used for example at https://dev.lfortran.org/), maintaining WASM->x64 is much simpler than ASR->x64 directly.
I think the question was why maintain your own WASM->x64 at all though? Couldn't you just run an existing WASM->x64 tool (using -O0 or equivalent if the goal is quick compilation for debug mode)?
As someone who's only played with Fortran, and never done anything too serious with it, can you explain an optimization that can be done in Fortran that can't be done in a C-like language?

I'm not being argumentative, I'm actually really curious.

A simple example is returning an allocatable array from a function, where the Fortran compiler can decide to allocate on a stack instead, or even inline the function and eliminate completely. While in C the compiler would need to understand the semantics of an allocatable array. If you use raw C pointer and malloc, and use Clang, my understanding is that Clang translates quite directly to LLVM and LLVM is too low level to optimize this out, depending on the details of how you call malloc.

Of course, you can rewrite your C code by hand to generate the same LLVM code from Clang, as LFortran generates for the Fortran code. So in principle I think anything can be done in C, as anything can be done in assembly or machine code. But the advantage of Fortran is that it is higher level, and thus allows you to write code using arrays in a high level way and do not have to do many special things as a programmer, and the compiler can then highly optimize your code. While in C very often you might need to do some of these optimizations by hand as a user.

Fortran has true multidimensional arrays in a way that C doesn't have--if you know an array is 5x3, you know that A[6, 1] doesn't map to a valid element whereas in C, it does map to a valid element. This turns out to make a lot of loop optimizations easier. (Also, being Fortran, you tend to pass around arrays with size information anyways, which C doesn't do, since you typically just get pointers with C).
Is the size info compile-time or runtime in Fortran?
It can be both. If you know the dimension at compile time, it is compile time, if you don't it will be runtime.
I don’t think such an optimization exists.

The nice think about Fortran it that is does the sensible thing by default for the type of scientific computing codes that are inside it’s wheelhouse (the trivial example, it assumes arguments don’t alias by default).

C can beat anything, assuming unlimited effort. Fortran is nice for scientists who want to write pretty good code. Or grad students who are working on dissertations in something other than hand-tuning kernels.

C can be as good as Fortran if make sure to declare pointers "restrict". That is a C feature added in C99 though. For Fortran it has always been the default.

For a long time Fortran was actually unbeatable and C did not suffice to specify all possible uses of assembly...

This is the correct answer. They almost entirely compile to the same machine code for the computationally intensive parts. (Even Julia does that these days.) But the limitations of Fortran prevent a lot of difficult-to-debug C bugs, while not affecting typical scientific and numerical capability.
Here’s a link to a StackOverflow answer that gives a good example: “Is Fortran easier to optimize than C for heavy calculations?” [0]

[0]: https://stackoverflow.com/questions/146159/is-fortran-easier...

The most significant distinction is that dummy arguments in Fortran can generally be assumed by an optimizer to be free of aliasing, when it matters. Modifications to one dummy argument can't change values read from another, or from global data. So a loop like

  subroutine foo(a, b, n)
    integer n
    real a(n), b(n)
    do j = 1, n
      a(j) = 2 * b(j)
    end do
  end
can be vectorized with no concern about what might happen if the `b` array shares any memory with the `a` array. The burden is on the programmer to not associate these dummy arguments on a call with data that violate this requirement.

(This freedom from aliasing doesn't extend to Fortran's POINTER feature, nor does it apply to the ASSOCIATE construct, some compilers notwithstanding.)

> The burden is on the programmer to not associate these dummy arguments on a call with data that violate this requirement.

What happens when the programmer pass aliasing a and b? Will it cause UB, like in C if you violate the restrict keyword?

Fortran's standard doesn't use the term Undefined Behavior; instead, it states a requirement that an object modified via one dummy argument must not be modified or referenced via any other name. When a program violates that requirement, it's no longer conforming to the standard, and the language doesn't define what happens afterwards. In practice, you'll get bad results and a bad day of debugging.
> When a program violates that requirement, it's no longer conforming to the standard, and the language doesn't define what happens afterwards. In practice, you'll get bad results and a bad day of debugging.

This is just UB by another name

That's.. disappointing

Rewind your disappointed expectations back to Fortran II in the 1950's, the first Fortran with subprograms. The value proposition was this: if a programmer was willing to help the Fortran optimizer out by avoiding aliasing, the compiler would generate code that was usually fast enough to avoid the need to write in assembly. It was a great deal for the programmer at the time. Fortran succeeded not just because it allowed one to write more productively in a "high-level" language but because its compilers could optimize well enough that there was no significant performance cost to doing so. (And then its popularity allowed one to port Fortran code from one vendor's hardware to another.)
This can be done in C, but not C++, though in practice all C++ compilers support it. It's the `restrict` keyword
Suppose A and B are fixed length arrays. C = A + B - A is easily optimized in fortran, to a no op if C is not modified afterwards, and to a copy if it is.

In C this is pretty much impossible.

To be fair, there are C like languages (ispc, glsl) which makes this work with heroic compiler efforts.

LFortran is not necessarily using LLVM IR to compile. It's building up an ASR [1] structure that's already being used in the LFortran-specific backends. Potentially it can make full use of Fortran semantics!

[1] https://docs.lfortran.org/en/design/

> there are a number of optimizations that can be made for Fortran that can't be made for C-like languages

That used to be true a long time ago but since the restrict keyword was introduced in C99 it's not really true any more.

As an indirect answer, consider Julia, which is based on LLVM and seems to be competitive with Fortran on large scale, numerically intensive calculations.
Can Julia pre-compile to a binary executable now? If not, they can't replace Fortran.
That is not a true binary. Making julia truly compile into binaries is now the number 1 goal of the language according to Tim Holy and the julia team.
LFortran can translate your Fortran code to Julia via our Julia backend. Once Julia can compile to a binary, it will be exciting to do some comparisons, like speed of compilation and performance of the generated binary. As well as the quality of the Julia code that we generate, we'll be happy to improve it to create canonical Julia code, if at all possible.