by miscmask
1876 days ago
> and separately the y,z,w triples are contiguous ([y1,z1,w1,y2,z2,w2,y3,z3,w3...]).

Wouldn't having the ys, zs and ws occupy different cache lines be good enough? After all, the CPU only wants fast access to these data. Or maybe it's hard for CPUs to fetch from 4 different lines at a time (+1 for the instructions)?
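To make the question concrete, here is a rough sketch that counts how many distinct cache lines one element's x, y, z and w land on under each layout. Everything numeric is an assumption for illustration: 64-byte cache lines, 4-byte floats, arrays placed back to back starting at address 0, and the function names are made up for this example.

```python
LINE = 64  # assumed cache-line size in bytes
ELEM = 4   # assumed element size (32-bit float)

def lines_touched_soa(i, n):
    """Four separate arrays xs, ys, zs, ws of n elements each, back to back."""
    addrs = [(k * n + i) * ELEM for k in range(4)]
    return len({a // LINE for a in addrs})

def lines_touched_hybrid(i, n):
    """xs in one array, followed by packed triples [y1,z1,w1,y2,z2,w2,...]."""
    x_addr = i * ELEM
    base = n * ELEM  # the triples start right after xs
    triple = [base + (3 * i + k) * ELEM for k in range(3)]
    return len({a // LINE for a in [x_addr] + triple})

if __name__ == "__main__":
    n = 1024
    for i in (0, 5):
        print(i, lines_touched_soa(i, n), lines_touched_hybrid(i, n))
```

Under these assumptions the fully split layout touches 4 lines per element, while the hybrid one touches 2, or 3 when a triple straddles a line boundary. In a straight sequential pass both layouts still use every byte of every line they fetch, so the difference is in how many separate streams are live at once (4 versus 2), not in wasted bandwidth.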