by kwillets
1460 days ago
MinHash is a CPU hog due to precisely this step -- cheap rehashes are the order of the day. I don't think xor works, but simple arithmetic, IIRC h_i(key) = (i*h(key) + C) mod 2^32, is viable and SIMD-friendly. Look up minwise-independent hash functions.
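For anyone who wants to see the arithmetic rehash spelled out, here is a rough Python sketch, not a tuned implementation. Everything named in it is my own assumption: `base_hash` uses CRC32 as a stand-in for h(key), the constant `c` is arbitrary, and I use odd multipliers (2i + 1) rather than plain i so that each derived hash is a bijection mod 2^32.

```python
import zlib

MASK32 = 0xFFFFFFFF  # bitwise AND with this reduces a value mod 2^32


def base_hash(key: str) -> int:
    # Assumption: CRC32 stands in for h(key); any 32-bit hash would do here.
    return zlib.crc32(key.encode("utf-8")) & MASK32


def minhash_signature(keys, num_hashes=64, c=0x9E3779B9):
    # Hash every key once, then derive each h_i with one multiply and one add.
    # Raises ValueError if `keys` is empty, since min() has nothing to reduce.
    base = [base_hash(k) for k in keys]
    signature = []
    for i in range(num_hashes):
        # Assumption: odd multiplier, so x -> mult*x + c is a bijection mod 2^32.
        mult = 2 * i + 1
        signature.append(min((mult * h + c) & MASK32 for h in base))
    return signature


def estimate_jaccard(sig_a, sig_b):
    # Fraction of signature slots where the two minima agree.
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)
```

The expensive hash runs once per key and the inner loop is just multiply, add, mask, min, which is the part that maps well onto SIMD lanes. How good the resulting Jaccard estimates are depends on the base hash and the multipliers, so treat the parameter choices above as placeholders.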