Hacker News new | ask | show | jobs
by makmanalp 3397 days ago
Is there something fishy going on with the NaiveHashMap here? What's happening?

  impl Hasher for NaiveHasher {
      fn finish(&self) -> u64 {
          self.0
      }
      fn write(&mut self, _: &[u8]) {
          unimplemented!()
      }
      fn write_u64(&mut self, i: u64) {
          self.0 = i ^ i >> 7;
      }
  }
3 comments

NaiveHasher (declared as "struct NaiveHasher(u64);") is a tuple with one 64-bit field named "0". (Tuple fields are implitly named as 0, 1, 2 ...)

It implements a very simple hashing function for 64-bit numbers, each hashed number n overwrites the state with n xor (n >> 7). finish() will return the last written state.

This seems to work okay since the hashed structure "Code" contains exactly one 64-bit field.

I don't know if it's fishy, but it's certainly very custom. For a generic hashing implementation you'd at least want to mix in the previous state. I assume it was done more for brevity than performance though.

Looks like it only handles u64's being hashed and panics on all other input. That must be because it is known it will be used exactly with `.write_u64`. Not having to implement the general byte buffer hashing is a big benefit, removes the whole partial state tracking of the hasher, which the compiler probably would not optimize out (we don't know until we try though).
It's the Rust equivalent of the CUSTOM_HASH_FUNCTION macro in the C source. The comment from there might help explain things:

    // Define a custom hash function to use instead of khash's default hash
    // function. This custom hash function uses a simpler bit shift and XOR which
    // results in several percent faster performance compared to when khash's
    // default hash function is used.
    #define CUSTOM_HASH_FUNCTION(key) (khint32_t)((key) ^ (key)>>7)