Hacker News new | ask | show | jobs
by el_pollo_diablo 371 days ago
> In fact, even state-of-art compilers will break language specifications (Clang assumes that all loops without side effects will terminate).

I don't doubt that compilers occasionally break language specs, but in that case Clang is correct, at least for C11 and later. From C11:

> An iteration statement whose controlling expression is not a constant expression, that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation to terminate.

1 comments

C++ says (until the future C++ 26 is published) all loops, but as you noted C itself does not do this, only those "whose controlling expression is not a constant expression".

Thus in C the trivial infinite loop for (;;); is supposed to actually compile to an infinite loop, as it should with Rust's less opaque loop {} -- however LLVM is built by people who don't always remember they're not writing a C++ compiler, so Rust ran into places where they're like "infinite loop please" and LLVM says "Aha, C++ says those never happen, optimising accordingly" but er... that's the wrong language.

> Rust ran into places where they're like "infinite loop please" and LLVM says "Aha, C++ says those never happen, optimising accordingly" but er... that's the wrong language

Worth mentioning that LLVM 12 added first-class support for infinite loops without guaranteed forward progress, allowing this to be fixed: https://github.com/rust-lang/rust/issues/28728

For some context, 12 was released in April 2021. LLVM is now on 20 -- the versions have really accelerated in recent years.
At least it's not just clownish version acceleration. They decided they wanted versions to increase faster somewhere around 2017-2018 (4.xx), and the version increase is more or less linear before and after that time, just at different slopes.
And it's now a yearly major release, is it not?

Same as with GCC

I can't say I'm happy with "yearly broken backward compatibility," but at least it's predictable.
Looks like two major versions per year for LLVM.
Sure, that sort of language-specific idiosyncrasy must be dealt with in the compiler's front-end. In TFA's C example, consider that their loop

  while (i <= x) {
      // ...
  }
just needs a slight transformation to

  while (1) {
      if (i > x)
          break;
      // ...
  }
and C11's special permission does not apply any more since the controlling expression has become constant.

Analyzes and optimizations in compiler backends often normalize those two loops to a common representation (e.g. control-flow graph) at some point, so whatever treatment that sees them differently must happen early on.

In theory, in practice it depends on the compiler.

It is no accident that there is ongoing discussion that clang should get its own IR, just like it happens with the other frontends, instead of spewing LLVM IR directly into the next phase.