by vanderZwan
3397 days ago
Because matching the native 32-bit word size is better for the prefetcher, right? Wouldn't most CPUs these days be smart enough to detect that the index advances by 4 and is then used with fixed offsets? So:
    for (let i = 0; i < someUint8Array.length; i += 4) {
        let R = someUint8Array[i],     G = someUint8Array[i + 1],
            B = someUint8Array[i + 2], A = someUint8Array[i + 3];
        // ... manipulations here
    }
|
|
I have a feeling it's more of a JS JIT thing than a CPU prefetcher thing, but honestly I'm not really sure.
In the program I linked above, it was actually faster to use Uint32Array everywhere, with one set of functions to pull the 4 color values out of each element and another function to pack the 4 values back into a uint32.
Granted, it's been over a year since I last benchmarked that code, but I did reuse some of the image code recently and found iterating over a Uint32Array to be significantly faster. (And funnily enough, manually unrolling the Uint32Array loop into something similar to what you wrote gave an additional performance boost, but it was small enough not to be worth the extra weirdness in the code to me.)
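For anyone curious, here is a rough sketch of what the Uint32Array approach might look like. This is not the code from the linked program: the helper names (`red`, `green`, `blue`, `alpha`, `packRGBA`) and the sample data are made up for illustration, and the byte order assumes a little-endian machine, where a Uint32Array view over RGBA bytes holds each pixel as 0xAABBGGRR.

```javascript
// Hypothetical helpers, named for illustration only.
// Assumes little-endian: each 32-bit pixel reads as 0xAABBGGRR.
const red   = (p) => p & 0xff;
const green = (p) => (p >>> 8) & 0xff;
const blue  = (p) => (p >>> 16) & 0xff;
const alpha = (p) => (p >>> 24) & 0xff;

// Pack four 0..255 channel values back into one unsigned 32-bit value.
// The final >>> 0 converts the signed result of the bitwise ops to unsigned.
function packRGBA(r, g, b, a) {
  return ((a << 24) | (b << 16) | (g << 8) | r) >>> 0;
}

// Two made-up RGBA pixels, stored as bytes.
const bytes = new Uint8ClampedArray([255, 128, 0, 255, 10, 20, 30, 40]);

// A second view over the same buffer: one element per pixel.
const pixels = new Uint32Array(bytes.buffer);

// One read and one write per pixel, instead of four of each.
for (let i = 0; i < pixels.length; i++) {
  const p = pixels[i];
  // Example manipulation: invert the color channels, keep alpha.
  pixels[i] = packRGBA(255 - red(p), 255 - green(p), 255 - blue(p), alpha(p));
}
```

Both views share one buffer, so the inverted values show up in `bytes` as well. Whether this beats the byte-indexed loop will depend on the engine and the workload, so it is worth benchmarking rather than assuming.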