Hacker News new | ask | show | jobs
by dmpk2k 5069 days ago
No, you did not. If you had, there is absolutely no way you could have written:

"optimal is a binary search tree in an array, like the way heaps are implemented."

I leave it to other people to read the article first, then your comment, then vote accordingly. For shame.

2 comments

If I'm wrong, enlighten me.

Thus far all you've got is ad hominem arguments.

Anyways, I didn't come here to show off, but I did implement a b+ tree that can do > 1 million random lookups/second on a single core, on a > 100 mb tree. So believe it or not, I might know what I'm talking about.

TL;DR - If your element sizes are powers of two, and your cache is set-associative, you run the risk of set aliasing. Caches always use lower bits of the address as the key, probably because it's really fast.

If you have more aliases than there are ways in each set, items will become evicted from the cache. If you do it just right, you can end up with the worst case, and you get 100% cache misses. Therefore even something that might on the face of it seem perfectly cacheable (well... if you don't know quite how the cache works), with a small working set (note that the test involves repeating the same search over and over again), can actually end up suffering from 100% cache misses.

The L2 and L3 caches use physical addresses rather than virtual ones, so you can't fully control this effect from user mode.

(Fingers crossed for my terminology...)

Yeah, I get that. What I was getting at is that binary searches are terrible in general are _terrible_ as soon as you're not hitting cache anymore - and for any problem worth optimizing that's what happens, if nothing else because your data set doesn't fit in l2 anymore.

Worrying about stuff not being cached because of associativity is pointless when the last 5-10 levels aren't going to be in cache anyways because they don't fit; those last 5-10 cache misses are going to _utterly dominate_ your search time.

IOW, even if associativity isn't a problem at all, binary searches suck w.r.t. caches - I'm not at all saying they're wrong, just that it's not terribly relevant - if performance matters you need to be doing something other than a binary search.

So, a "binary search tree in an array, like the way heaps are implemented" actually would not experience the problem described in this article. The typical layout of the tree for a heap is "breadth-first, level-order" as opposed to the "depth-first, in-order" tree that is implicit to a sorted array. This means that your accesses to the first couple levels are in the same cache line, and your subsequence accesses as you move down the tree are less likely to line up. This is definitely not to say this is "optimal", however.