by pcwalton
2755 days ago
> In my experience, loop unrolling should basically never be done except in extremely degenerate cases

Not true. Like many such optimizations, loop unrolling can be useful because it makes downstream loads constant. For example:

    float identity[4][4];
    for (unsigned y = 0; y < 4; y++)
        for (unsigned x = 0; x < 4; x++)
            identity[y][x] = y == x ? 1 : 0;
    ... do some matrix math ...

In this case, the compiler probably wants to unroll the loops so that it can straightforwardly forward the constant matrix entries directly to the matrix arithmetic. It'll likely be able to eliminate lots of operations that way.

(You might ask "who would write this code?" As Schemers say: "macros do.")

See LLVM's heuristics: http://llvm.org/doxygen/LoopUnrollPass_8cpp.html#ad7c38776d7...
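
To make the forwarding concrete, here is a rough sketch of the before and after, assuming the "matrix math" is a matrix-vector product. The names `Vec4`, `transform_as_written` and `transform_after_folding` are made up for illustration, and `constexpr` is only there so the two versions can be compared at compile time. The second function is an idealized picture of what full unrolling plus constant folding could leave behind, not what any particular compiler is guaranteed to emit. In particular, dropping the multiplications by 0 is only exact for integers; for floats a compiler typically needs relaxed floating-point flags (e.g. `-ffast-math`) to do it.

```cpp
// Hypothetical 4-component vector type for the sketch.
struct Vec4 { float v[4]; };

// As written: build the identity with loops, then do a matrix-vector product.
constexpr Vec4 transform_as_written(const Vec4& in) {
    float identity[4][4] = {};
    for (unsigned y = 0; y < 4; y++)
        for (unsigned x = 0; x < 4; x++)
            identity[y][x] = y == x ? 1 : 0;

    Vec4 out = {};
    for (unsigned y = 0; y < 4; y++)
        for (unsigned x = 0; x < 4; x++)
            out.v[y] += identity[y][x] * in.v[x];  // loads from identity[][]
    return out;
}

// Idealized result once both loop nests are fully unrolled: every
// identity[y][x] load is a known 0 or 1, so the products fold away.
constexpr Vec4 transform_after_folding(const Vec4& in) {
    return Vec4{{in.v[0], in.v[1], in.v[2], in.v[3]}};
}
```

Without unrolling, `identity[y][x]` is indexed by loop variables, so the compiler has a much harder time seeing that each load is a constant.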
|
To expand on this point - in the more prosaic world of C++ - this sort of code comes up all the time in templated code. For example, the loop you posted above might have come from something like:
```
template <unsigned N, unsigned M>
class Matrix {
public:
    static Matrix Identity() { ... }
};
```

The other major source of these sorts of constants leading to DCE opportunities is inlining. Consider a more classical matrix implementation that is not templated and doesn't lift its dimensions into the type:
```
class Matrix {
    unsigned n;
    unsigned m;
public:
    static Matrix Identity(unsigned n, unsigned m) { ... }
};
```

Here, inlining a call such as `Identity(4, 4)` at the call site turns the `n` and `m` in the body of `Identity` into the constant 4.
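
A slightly fuller sketch of that non-templated case, under some assumptions of mine: the fixed-capacity storage, `kMaxDim`, `at`, and `trace4` are all hypothetical, and a real implementation would more likely heap-allocate. `constexpr` is used only so the result can be checked at compile time; the point is what an optimizer can see at an ordinary call site after inlining.

```cpp
// Hypothetical fixed-capacity matrix with runtime dimensions.
class Matrix {
public:
    static constexpr unsigned kMaxDim = 4;  // assumption for this sketch

    // n and m are ordinary runtime parameters here, not template arguments.
    // No bounds checking: n and m must not exceed kMaxDim.
    static constexpr Matrix Identity(unsigned n, unsigned m) {
        Matrix r(n, m);
        for (unsigned y = 0; y < n; y++)
            for (unsigned x = 0; x < m; x++)
                r.data_[y][x] = y == x ? 1.0f : 0.0f;
        return r;
    }

    constexpr float at(unsigned y, unsigned x) const { return data_[y][x]; }

private:
    constexpr Matrix(unsigned n, unsigned m) : n_(n), m_(m), data_{} {}

    unsigned n_;
    unsigned m_;
    float data_[kMaxDim][kMaxDim];
};

// Call site: once Identity is inlined here, n and m are both the constant 4,
// so the loop bounds are known and the loops become candidates for unrolling.
constexpr float trace4() {
    Matrix id = Matrix::Identity(4, 4);
    float t = 0.0f;
    for (unsigned i = 0; i < 4; i++)
        t += id.at(i, i);
    return t;
}
```

In the templated version the same constants arrive through `N` and `M` instead, so no inlining is needed to learn the loop bounds.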
If I had to make an educated guess - inlining is what most often generates these opportunities (i.e. partial evaluation, constant folding, and DCE) in compilers. An incredible amount of information can flow from caller to callee when you specialize the callee for that call site.
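
As a small illustration of that caller-to-callee flow, here is a sketch with made-up names (`scale`, `caller`, `caller_specialized`). The last function is an idealized picture of what remains after inlining, constant folding, and DCE; actual compiler output will vary.

```cpp
// Hypothetical callee with a mode flag.
constexpr int scale(int x, bool clamp) {
    int r = x * 3;
    if (clamp) {  // becomes dead code when inlined with clamp == false
        if (r > 100) r = 100;
        if (r < 0) r = 0;
    }
    return r;
}

// The caller passes a constant flag, so inlining exposes it to the callee body.
constexpr int caller(int x) { return scale(x, false); }

// Idealized result of specializing scale for this call site:
// the branch on clamp folds to false and its body is eliminated.
constexpr int caller_specialized(int x) { return x * 3; }
```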