by jbapple
3312 days ago
> do you mean the AVX2 'gather' type instructions?

Yes, and the AVX-512 ones too, with two caveats:

1. This is processor-dependent. Different processors have different relative instruction latencies.

2. Sometimes a related instruction is faster than the instruction designed specifically for that purpose. For instance, in the glibc on my machine, memcmp and strncmp are NOT implemented using the SSE4.2 string comparison instructions, but instead use ptest and pcmpeq, respectively, because that is faster. The same could be true of gathers.
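To make the gather caveat concrete, here is a minimal sketch in C of the two alternatives: a table lookup written with the AVX2 gather intrinsic `_mm256_i32gather_epi32`, and the same lookup written with ordinary scalar loads. The function names (`sum_scalar`, `sum_gather_avx2`, `sum_gather`) are made up for illustration, and the code assumes GCC or Clang (it relies on `__attribute__((target("avx2")))` and `__builtin_cpu_supports`). It does not show which version is faster; that has to be measured on the processor in question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define HAVE_X86 1
#endif

/* Baseline: one ordinary scalar load per index.
   Sums are accumulated as unsigned so overflow wraps instead of being UB. */
int32_t sum_scalar(const int32_t *table, const int32_t *idx, size_t n) {
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += (uint32_t)table[idx[i]];
    }
    return (int32_t)sum;
}

#ifdef HAVE_X86
/* Hypothetical helper: the same lookup, eight indices at a time,
   using the AVX2 gather instruction. */
__attribute__((target("avx2")))
static int32_t sum_gather_avx2(const int32_t *table, const int32_t *idx, size_t n) {
    __m256i acc = _mm256_setzero_si256();
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256i vidx = _mm256_loadu_si256((const __m256i *)(idx + i));
        /* Scale 4: indices count 4-byte elements. */
        __m256i vals = _mm256_i32gather_epi32((const int *)table, vidx, 4);
        acc = _mm256_add_epi32(acc, vals);
    }
    /* Horizontal sum of the eight lanes. */
    int32_t lanes[8];
    _mm256_storeu_si256((__m256i *)lanes, acc);
    uint32_t sum = 0;
    for (int k = 0; k < 8; k++) {
        sum += (uint32_t)lanes[k];
    }
    /* Leftover indices that did not fill a full vector. */
    for (; i < n; i++) {
        sum += (uint32_t)table[idx[i]];
    }
    return (int32_t)sum;
}
#endif

/* Uses the gather version when the CPU reports AVX2, else the scalar one. */
int32_t sum_gather(const int32_t *table, const int32_t *idx, size_t n) {
    assert(table != NULL && idx != NULL);
#ifdef HAVE_X86
    if (__builtin_cpu_supports("avx2")) {
        return sum_gather_avx2(table, idx, n);
    }
#endif
    return sum_scalar(table, idx, n);
}
```

Both functions return the same sum for the same inputs, so timing `sum_gather` against `sum_scalar` on a given machine is one way to check whether the gather instruction actually pays off there.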