|
|
|
|
|
by jart
1652 days ago
|
|
PMOVMSKB/BSF/POPCNT takes serious wizardry, but instructions like PCLMULLQLQDQ make you feel like Gandalf. It's defined: pair clmul(uint64_t a, uint64_t b) {
uint64_t t, x = 0, y = 0;
if (a && b) {
if (bsr(a) < bsr(b)) t = a, a = b, b = t; /* optional */
for (t = 0; b; a <<= 1, b >>= 1) {
if (b & 1) x ^= a, y ^= t;
t = t << 1 | a >> 63;
}
}
return (pair){x, y};
}
There's a famous paper on how it can perform polynomial division at 40gbps. It's really cool that it has practical applications in things like CSV too. https://www.intel.com/content/dam/www/public/us/en/documents... |
|