by minimax
3851 days ago
According to the Stephan Brumme website you linked to, the slice-by-8 lookup table is 8 KB and the slice-by-16 table is 16 KB, so your combined version of crc32 needs 24 KB of L1 cache to run at full speed. Modern server-class CPUs typically have 32 KB of L1 dcache per core, so that doesn't leave much room for the rest of your work. Maybe that's reasonable (I don't really know what Cyrus does), but I thought it was worth thinking about.
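The table sizes above follow from simple arithmetic, which can be sketched as below. This is a rough illustration, not code from Cyrus or from the linked page: the function names `crc32_table_bytes` and `combined_table_bytes` are hypothetical, and the sketch assumes each slice-by-N variant keeps its own table of N sub-tables with 256 32-bit entries each.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes used by a slice-by-N CRC-32 lookup table:
 * N sub-tables, each holding 256 entries of 32 bits.
 * (Hypothetical helper, for illustration only.) */
static size_t crc32_table_bytes(size_t slices)
{
    return slices * 256 * sizeof(uint32_t);
}

/* Combined footprint, ASSUMING slice-by-8 and slice-by-16
 * are stored as two separate tables rather than sharing one. */
static size_t combined_table_bytes(void)
{
    return crc32_table_bytes(8) + crc32_table_bytes(16);
}
```

Under that assumption, `crc32_table_bytes(8)` is 8192 bytes, `crc32_table_bytes(16)` is 16384 bytes, and the combined footprint is 24576 bytes (24 KB). If the two variants could share a single 16-slice table, the footprint would be 16 KB instead.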
|
Also, the oldest CPU on our production servers seems to be the E5520 right now, which has 128 KB of L1 data cache in total (32 KB per core across its four cores).