by fluffything
2424 days ago
I think the key optimization that ICC performs for C++ here, and that Rust does not, is just FP-contraction (fusing a multiply and an add into one operation with a single rounding), which is related to, but not the same as, assuming associativity. The RFC about that is https://github.com/rust-lang/rfcs/pull/2686, where you see users kind of split into the "I want faster binaries" and "I want more deterministic execution" camps. Neither is wrong TBH. Some people have tried to show there that enabling FP-contraction by default isn't always better / more precise, but I'm not sure if they succeeded.
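To make the contraction point concrete, here is a minimal Rust sketch of how a fused and an unfused `a * b + c` can disagree. The function names and the inputs are my own, picked for illustration; `f64::mul_add` is the standard library's fused multiply-add, and I'm assuming the compiler does not contract the plain expression on its own (which is Rust's current default).

```rust
/// Computes a * b + c with two roundings: one after the multiply,
/// one after the add. Assumes the compiler does not contract it.
pub fn unfused(a: f64, b: f64, c: f64) -> f64 {
    a * b + c
}

/// Computes a * b + c with a single rounding (fused multiply-add).
pub fn fused(a: f64, b: f64, c: f64) -> f64 {
    a.mul_add(b, c)
}

/// Hypothetical inputs chosen so that the two versions disagree.
/// Returns (a, c); the intended call is f(a, a, c).
pub fn demo_inputs() -> (f64, f64) {
    let eps = 1.0 / (1u64 << 30) as f64; // 2^-30, exactly representable
    let a = 1.0 + eps; // 1 + 2^-30
    let c = -(1.0 + 2.0 * eps); // -(1 + 2^-29)
    (a, c)
}
```

With these inputs the exact product `a * a` is 1 + 2^-29 + 2^-60. The unfused version rounds away the 2^-60 term before adding `c` and returns 0, while the fused version keeps it and returns 2^-60. Neither result is "wrong" under its own rules, which is more or less the split you see in the RFC thread.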
I think associativity is necessary to vectorize reduction operations, like summing the elements of an array in a loop.
I haven't looked at the code generated by ICC, but I would expect it to vectorize this by computing tuples of "partial sums" r0, r1, r2, ... (one per vector lane), and then doing a horizontal sum r = r0 + r1 + r2 + ... at the end. But this requires associativity. (And commutativity, but that's a given.)
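A rough scalar sketch of that partial-sums shape in Rust, next to the plain sequential loop. This is not what ICC emits; the function names and the lane count of 4 are assumptions for illustration, and a real vectorizer would keep the accumulators in one SIMD register.

```rust
/// Number of partial sums; a stand-in for the SIMD width (assumed, not ICC's).
pub const LANES: usize = 4;

/// Sequential reduction: ((x0 + x1) + x2) + ...
pub fn sum_sequential(xs: &[f32]) -> f32 {
    let mut r = 0.0f32;
    for &x in xs {
        r += x;
    }
    r
}

/// Reassociated reduction: LANES independent partial sums,
/// then a horizontal sum, then any leftover tail elements.
pub fn sum_partial(xs: &[f32]) -> f32 {
    let mut acc = [0.0f32; LANES];
    let chunks = xs.chunks_exact(LANES);
    let tail = chunks.remainder();
    for chunk in chunks {
        for lane in 0..LANES {
            acc[lane] += chunk[lane];
        }
    }
    // Horizontal sum r = r0 + r1 + r2 + r3.
    let mut r: f32 = acc.iter().sum();
    for &x in tail {
        r += x;
    }
    r
}
```

For the input `[1e8, 1.0, 0.0, 0.0, -1e8, 1.0, 0.0, 0.0]`, `sum_sequential` returns 1.0 (the first 1.0 is absorbed by 1e8) and `sum_partial` returns 2.0, so the compiler can't swap one for the other unless it is allowed to treat float addition as associative.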