Hacker News
by sifar 1497 days ago
>> For SIMD, this is a complicated mess for the compiler to unravel

This is trivially vectorizable for SIMD, and it would fit nicely in a VLIW packet too. The only issue is a runtime memory stall: if any one access stalls, the entire pipeline stalls.

With predication, modern SIMD can even parallelize if conditions like the one below.

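    int[] x, y, z; int[] p, d, q;

    // no loop-carried dependency: every iteration i is independent
    for (int i = 0; i < size; ++i) {
        p[i] = x[i] / z[i];
        d[i] = z[i] * x[i];
        // the condition depends only on i, so it can become a per-lane mask
        if (i > n) {
            q[i] = y[i] + z[i];
        } else {
            q[i] = y[i];
        }
    }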
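
To make the predication point concrete, here is a rough scalar sketch, assuming the snippet above is Java. The class and method names (PredicationSketch, branchy, predicated) are made up for illustration. The branch is replaced by a mask that is all ones where i > n and all zeros elsewhere, which is roughly what a SIMD unit does per lane with a mask register or blend instruction. The AND-mask here is only a stand-in for that hardware mechanism, not a claim about what any particular compiler emits.

```java
import java.util.Arrays;

public class PredicationSketch {
    // Reference version: the q[] part of the loop above, with a real branch.
    static int[] branchy(int[] y, int[] z, int n) {
        int[] q = new int[y.length];
        for (int i = 0; i < y.length; ++i) {
            if (i > n) {
                q[i] = y[i] + z[i];
            } else {
                q[i] = y[i];
            }
        }
        return q;
    }

    // Predicated version: both sides are folded into one expression,
    // and a mask decides whether z[i] contributes.
    static int[] predicated(int[] y, int[] z, int n) {
        int[] q = new int[y.length];
        for (int i = 0; i < y.length; ++i) {
            int mask = (i > n) ? -1 : 0;   // -1 is all ones in two's complement
            q[i] = y[i] + (z[i] & mask);   // adds z[i] only where the mask is set
        }
        return q;
    }

    public static void main(String[] args) {
        int[] y = {1, 2, 3, 4, 5, 6, 7, 8};
        int[] z = {10, 20, 30, 40, 50, 60, 70, 80};
        int n = 3;
        // prints [1, 2, 3, 4, 55, 66, 77, 88]
        System.out.println(Arrays.toString(predicated(y, z, n)));
        // prints true
        System.out.println(Arrays.equals(branchy(y, z, n), predicated(y, z, n)));
    }
}
```

Since the mask depends only on i and n, every lane can be computed without knowing what the other lanes did, which is why the if/else does not block vectorization.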