Hacker News new | ask | show | jobs
by Tojot 3576 days ago
It so happens that a large part of my PhD was on this very subject. The result I've got N log(N), this is more visible when you get to larger RAM (I had 0,5 TB RAM at the time). We have an empirical result, a justification and a rigorous predictive model.

The reason has to do with hashing, but a different type: TLB.

I posted more details as https://news.ycombinator.com/item?id=12385458

1 comments

Thanks, I came across your research before and thought it was quite cool. In Section 8.5, when discussing whether hash tables would be suitable for handling TLB misses, wouldn't denial of service attacks also be a concern? If an attacker knows the hash function and can control the addresses being looked up, they might be able to trigger worst-case behaviour on every lookup, couldn't they?
That's addressed in that very section:

> Moreover, an adversary can try to increase a number of necessary rehashes.

It seems to me that the section is a bit too dismissive, though, as there are hash tables and hash functions that mitigate these concerns. In particular, collisions can be replaced with trees, like in Java, limiting the worst case to O(log n) again.

Rehashes and bad probing behaviour aren't quite the same thing, but I'll let it count and admit I may have replied too quickly ;)