Hacker News new | ask | show | jobs
by jeanmichelx 2647 days ago
So how does this degrade for bigger hash tables? Surely the two loops implementation is less efficient since your cache is trashed by the time you do the second look up
2 comments

This varies not with the size of the whole hash table, but with the distance from any given index to the first empty bucket. By keeping the load factor constant, you can grow the hash table as much as you want without degrading the expected performance.
Good point, thanks!
Not necessarily, as the sibling mentioned load of the table matters, but also the fact that the single loop has to do other housekeeping may also mean that it could ruin any cache prefetching that could be happening because of the less predicable pattern. Ideally the housekeeping variables are going to be in registers but that's not always going to be possible. In this case, two loops that look at the data sequentially are going to predict very very nicely and probably load more of the cache as it's searching than anything that could be random.