Hacker News new | ask | show | jobs
by kazagistar 3397 days ago
The default hash table has better security against malicious input by using a slower hashing algorithm. Its the right default, but if you really want performance and to compete with C/C++, you have to use an algorithm that makes different tradeoffs, or you would be comparing apples to oranges.
3 comments

> or you would be comparing apples to oranges

Do the other programs use that same hash table algorithm?

http://www.sebastiansylvan.com/post/robin-hood-hashing-shoul...

Do the hash tables in use by the C/C++ entries use low-security hash functions? Seems like that needs evidence.
The first-place Rust program uses this very simple low-security hash function:

    impl Hasher for NaiveHasher {
        fn write_u64(&mut self, i: u64) {
            self.0 = i ^ i >> 7;
        }
    }
The second-place C program uses the exact same hash function as the Rust program, except it also truncates the result to 32 bits:

    #define CUSTOM_HASH_FUNCTION(key) (khint32_t)((key) ^ (key)>>7)
The third-place C++ program uses the identity function as its hash function:

    struct hash{
        uint64_t operator()(const T& t)const{ return t.data; }
    };
Sources:

- Rust: http://benchmarksgame.alioth.debian.org/u64q/program.php?tes...

- C: http://benchmarksgame.alioth.debian.org/u64q/program.php?tes...

- C++: http://benchmarksgame.alioth.debian.org/u64q/program.php?tes...

> The default hash table has better security against malicious input by using a slower hashing algorithm.

I think an adaptive hash that switches from fast to secure when collisions are detected would make a better default choice (at the cost of some implementation complexity).

Or possibly even an implementation with log(n) worst case complexity.