by sifar
1497 days ago
>> For SIMD, this is a complicated mess for the compiler to unravel

This is trivially vectorizable for SIMD, and it would fit nicely in a VLIW packet too. The only issue is that if any access hit a runtime memory stall, the entire pipeline would stall.

With predication, modern SIMD can even parallelize if conditions like the one below.

    int[] x, y, z;
    int[] p, d, q;

    for (int i = 0; i < size; ++i) {
        p[i] = x[i] / z[i];
        d[i] = z[i] * x[i];
        if (i > n) {
            q[i] = y[i] + z[i];
        } else {
            q[i] = y[i];
        }
    }
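
To make the predication point concrete, here is a rough scalar sketch of the same loop after if-conversion, where the branch becomes a per-element mask. The class and method names (PredicationSketch, branchy, predicated) are made up for illustration, and the mask trick assumes n - i does not overflow and that z contains no zeros. A predicated SIMD unit would apply the same idea across a whole vector of lanes at once; this only shows the transformation, not what any particular compiler emits.

```java
import java.util.Arrays;

// Hypothetical illustration class; not from the original comment.
final class PredicationSketch {

    // The loop as posted: the body contains a data-independent branch on i.
    static void branchy(int[] x, int[] y, int[] z,
                        int[] p, int[] d, int[] q, int n) {
        for (int i = 0; i < q.length; ++i) {
            p[i] = x[i] / z[i];          // assumes z[i] != 0
            d[i] = z[i] * x[i];
            if (i > n) {
                q[i] = y[i] + z[i];
            } else {
                q[i] = y[i];
            }
        }
    }

    // If-converted form: the condition becomes a mask instead of a branch.
    static void predicated(int[] x, int[] y, int[] z,
                           int[] p, int[] d, int[] q, int n) {
        for (int i = 0; i < q.length; ++i) {
            p[i] = x[i] / z[i];          // assumes z[i] != 0
            d[i] = z[i] * x[i];
            // Arithmetic shift copies the sign bit: -1 (all ones) when i > n,
            // 0 otherwise. Assumes n - i does not overflow.
            int mask = (n - i) >> 31;
            // z[i] is added only in the lanes where the mask is set.
            q[i] = y[i] + (z[i] & mask);
        }
    }

    public static void main(String[] args) {
        int[] x = {8, 9, 10, 12};
        int[] y = {1, 2, 3, 4};
        int[] z = {2, 3, 5, 4};
        int n = 1;

        int[] p1 = new int[4], d1 = new int[4], q1 = new int[4];
        int[] p2 = new int[4], d2 = new int[4], q2 = new int[4];
        branchy(x, y, z, p1, d1, q1, n);
        predicated(x, y, z, p2, d2, q2, n);

        System.out.println(Arrays.toString(q1));
        System.out.println(Arrays.equals(q1, q2));
    }
}
```

Both versions compute the same p, d and q; the second just has no control flow inside the loop body, which is the shape a masked vector instruction wants. Whether a given compiler or JIT actually vectorizes either form is a separate question that this sketch does not answer.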