https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86...
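The pcmpeqb instruction on 128-bit XMM registers is from SSE2, not SSE4 (the original MMX form works on 8 bytes); it compares 16 bytes per op and sets each result byte to 0xFF where the inputs match and 0x00 where they don't.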
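
To make the 16-bytes-at-a-time idea concrete, here is a minimal sketch in C using the `_mm_cmpeq_epi8` intrinsic from `<emmintrin.h>`, which compilers normally lower to pcmpeqb. The helper name `find_byte16` is made up for illustration and this is not the glibc code from the link; it also assumes an x86 target and GCC/Clang for `__builtin_ctz`.

```c
#include <emmintrin.h>  /* _mm_* intrinsics for 128-bit integer ops */

/* Hypothetical helper: index of the first byte equal to c within the
 * 16 bytes starting at p, or -1 if none match. The caller must make
 * sure all 16 bytes are readable. */
static int find_byte16(const unsigned char *p, unsigned char c)
{
    /* Unaligned load of 16 bytes. */
    __m128i chunk = _mm_loadu_si128((const __m128i *)p);

    /* Broadcast the byte we are looking for into all 16 lanes. */
    __m128i needle = _mm_set1_epi8((char)c);

    /* pcmpeqb: each lane becomes 0xFF on a match, 0x00 otherwise. */
    __m128i eq = _mm_cmpeq_epi8(chunk, needle);

    /* pmovmskb: collect the top bit of each lane into a 16-bit mask. */
    int mask = _mm_movemask_epi8(eq);

    /* The lowest set bit is the first match (GCC/Clang builtin). */
    return mask ? __builtin_ctz((unsigned)mask) : -1;
}
```

A real strlen/memchr also has to deal with alignment and with not reading past the end of a page, which this sketch ignores.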