|
|
|
|
|
by tom_mellior
2424 days ago
|
|
> I think the key issue with the optimizations that ICC is performing for C++ but Rust is not doing in this case is just FP-contraction, which is related to, but not the same as, assuming associativity. I think associativity is necessary to vectorize reduction operations like: r+=(c[i]-a[i]*b[i])*a[i]*(c[i]-a[i]*b[i]);
I haven't looked at the code generated by ICC, but I would expect it to vectorize this by computing tuples of "partial sums", roughly as follows: r0 += (c[i+0]-a[i+0]*b[i+0])*a[i+0]*(c[i+0]-a[i+0]*b[i+0]);
r1 += (c[i+1]-a[i+1]*b[i+1])*a[i+1]*(c[i+1]-a[i+1]*b[i+1]);
r2 += (c[i+2]-a[i+2]*b[i+2])*a[i+2]*(c[i+2]-a[i+2]*b[i+2]);
...
and then doing a horizontal sum r = r0 + r1 + r2 + ... in the end. But this requires associativity. (And commutativity, but that's a given.) |
|