Hacker News new | ask | show | jobs
by zwegner 1644 days ago
CLMUL in general is a bit weird to wrap your head around, but a CLMUL with -1 isn't too tricky: it's like a running 1-bit sum, or in other words, each bit in the result is the parity of all the bits up to that point in the multiplier.

> In our tests, going byte to byte for more iterations to keep the alignment when hitting the "else case" performed worse than making the unaligned loads, but as you say "just use CLMUL" (as all loads will be aligned) :D

I was talking about using bitwise operations with the quote/escape/newline masks already computed (like in the blog post I linked), rather than a byte-by-byte loop. But yeah, CLMUL is better anyways :)