|
|
|
|
|
by crote
375 days ago
|
|
You've got to keep in mind that computers aren't the 1-instruction-at-a-time purely sequential machines anymore. Let's say you've got a linear array of bytes, and you want to see if it contains a specific value. What would a modern CPU need to execute? Well, we can actually compare 64 values at a time with _mm512_mask_cmp_epu8_mask! You still need a little bit of setup and a final "did any of them match" check, of course. Want to compare 512 values? You can probably do that in less than 10 clock cycles with modern machines Doing the same with a hash set? Better make sure that hash algorithm is fast. Sure it's O(1), but if calculating the hash takes 20 cycles it doesn't matter. |
|
A string search algorithm that uses SIMD to do quickly discard a majority of 16, 32 or 64 attempts in parallel, and then verify the surviving ones quadratically (again 16, 32 or 64 bytes at a time) can go a very long way against a sublinear algorithm that understands needle structure, but necessarily needs to process the haystack one byte at a time.