IMO, compiling C with no optimization flags is a somewhat arbitrary baseline, since the compiler writers could just as easily have decided to do X, Y, and Z by default, and then you'd need to turn them off explicitly. FWIW, without -O, with -O, and with -O4, I get 2500ms, 1500ms, and 550ms respectively. I didn't bother to look at the generated assembly (.S) to see what actually improved. (Of course, I had to edit the code to output the results; otherwise the compiler just optimized everything out.)
Adding the -ffast-math switch appears to make no difference. I'm never sure what -ffast-math does exactly.
Minimal case on Godbolt:
https://godbolt.org/z/W18YsnMY5 - without the f
https://godbolt.org/z/oc1s8WKeG - with the f