by gnufx 1680 days ago

You will generally want at least -funsafe-math-optimizations for performance-critical loops. Otherwise you won't get vectorization at all with ARM Neon, for instance. You also won't get some simple loops vectorized (reductions such as products, for example), or generally(?) loop nest optimizations. You may simply not be able to afford what can be an order-of-magnitude cost if your code is bottlenecked on such things (although HPC code may well not be).
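To illustrate why a plain reduction needs this kind of flag before it can be vectorized, here is a minimal C sketch. The function names are made up for the illustration, and the second function is only a hand-written approximation of the shape a 4-lane vectorizer might produce, not what any particular compiler emits. Floating-point addition is not associative, so the two orderings can give different answers, and a compiler following strict semantics has to keep the first one.

```c
#include <stddef.h>

/* Strict left-to-right sum: the evaluation order the compiler must keep
   unless reassociation is allowed (e.g. by -funsafe-math-optimizations). */
float sum_sequential(const float *x, size_t n) {
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += x[i];
    return s;
}

/* Hand-reassociated sum with four partial accumulators, roughly what a
   4-lane SIMD reduction computes. Illustrative only. */
float sum_four_lanes(const float *x, size_t n) {
    float acc[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        for (size_t k = 0; k < 4; k++)
            acc[k] += x[i + k];
    /* Combine the lanes, then handle any leftover elements. */
    float s = (acc[0] + acc[1]) + (acc[2] + acc[3]);
    for (; i < n; i++)
        s += x[i];
    return s;
}
```

With the input {1e20f, 1.0f, -1e20f, 1.0f}, sum_sequential returns 1.0f while sum_four_lanes returns 0.0f, because each 1.0f is absorbed when added to a value of magnitude 1e20. That discrepancy is what the flag gives the compiler permission to introduce.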

In my experience much scientific Fortran code, at least, is OK with something like -ffast-math, if only because it's likely to have been built with ifort at some stage (which defaults to a relaxed floating-point model), or even run on non-754-compliant hardware if it's old enough. Obviously you should check, though, and perhaps confine such optimizations to where they're needed.
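One possible way to confine the relaxation to a single hot routine is sketched below in C, using GCC's function-level optimize attribute. The macro name RELAXED_FP and the product kernel are hypothetical. The GCC manual cautions that this attribute is intended for debugging rather than production use, so putting the hot loops in their own source file and compiling only that file with -funsafe-math-optimizations is probably the more robust route.

```c
#include <stddef.h>

/* GCC-specific: request -funsafe-math-optimizations for one function only.
   Other compilers fall back to a plain definition with strict semantics. */
#if defined(__GNUC__) && !defined(__clang__)
#define RELAXED_FP __attribute__((optimize("unsafe-math-optimizations")))
#else
#define RELAXED_FP
#endif

/* Hypothetical hot kernel: a product reduction, one of the simple loops
   that may only vectorize once reassociation is permitted. */
RELAXED_FP
double product(const double *x, size_t n) {
    double p = 1.0;
    for (size_t i = 0; i < n; i++)
        p *= x[i];
    return p;
}
```

The rest of the translation unit keeps the default floating-point rules. Whether the loop actually vectorizes still depends on the optimization level and target.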

BLIS turned on -funsafe-math-optimizations (if I recall correctly) to provide extra vectorization, and still passed its extensive test suite. (The GEMM implementation is possibly the ultimate loop nest restructuring.)