Hacker News new | ask | show | jobs
by continuational 3603 days ago
It's a common misconception that hashmaps have constant time lookups. In order for the lookup to be constant time, the hash function must output enough bits to avoid causing too many collisions. For example, 16 bits is too little for a hash map with a million keys. In fact, the hash function output size should be O(log(n)) where n is the number of entries. Processing log(n) bits takes at least O(log(n)) time. QED
1 comments

> Processing log(n) bits takes at least O(log(n)) time.

No, because log(n) might be smaller than the word size of the machine, which is the common case for hash table implementations.

If you limit log(n) and thus n to a constant, like the word size of the machine, then every algorithm is constant time.
No, I am not forcing all n of any size to a constant, I'm saying that we can bottom out inductively for small n.

This is a very old discussion. Read Knuth's rationale. Then read the similar sections in Segwick et all or whatever other basic competence textbook on algorithms you like. Compare them, think about which contexts each approach is most useful in.

In other words, no, you have not somehow cleverly invalidated all of complexity analysis.

When log(n) is within the word size the hashmap lookup for the most part is actually constant i.e., few specific machine operation. A generic log(n) algorithm still requires log(n) distinct operation for the most part however small that number is.