Hacker News
by kwillets 1460 days ago
MinHash is a CPU hog due to precisely this step -- cheap rehashes are the order of the day.

I don't think xor works, but simple arithmetic is viable and SIMD-friendly: IIRC h_i(key) = (i*h(key) + C) mod 2^32. Look up minwise-independent hash functions.
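A rough sketch of what that could look like, taking the formula above literally: hash each key once, then derive the k per-slot values with a multiply-add. The names (`base_hash`, `minhash_signature`, `estimate_jaccard`), the choice of `zlib.crc32` as the base hash, and the constant `c` are all placeholders of mine, not something from a particular library, and I haven't checked how close this exact family gets to min-wise independence.

```python
import zlib

MASK32 = 0xFFFFFFFF  # reduce mod 2^32 with a bit mask


def base_hash(key: str) -> int:
    # The one "real" hash per key. crc32 is only a stand-in (assumption);
    # any decent 32-bit hash could go here.
    return zlib.crc32(key.encode("utf-8")) & MASK32


def minhash_signature(keys, k=64, c=0x9E3779B9):
    # Cheap rehash: h_i(key) = (i * h(key) + c) mod 2^32 for i = 1..k.
    # c is an arbitrary odd constant picked for illustration (assumption).
    sig = [MASK32] * k
    for key in keys:
        h = base_hash(key)
        for i in range(1, k + 1):
            hi = (i * h + c) & MASK32
            if hi < sig[i - 1]:
                sig[i - 1] = hi  # keep the minimum per hash function
    return sig


def estimate_jaccard(sig_a, sig_b):
    # Fraction of slots where the two signatures agree.
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)
```

The inner loop is just a multiply, an add, and a min per slot, which is why it maps well onto SIMD lanes. One caveat with the literal formula: for even i the multiplication is not a bijection mod 2^32, so a real implementation might prefer odd or random multipliers.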