| HN Mirror

"In-memory" doesn't mean so much as it once did. Faulting a page into cache from RAM takes hundreds or even thousands of cycles. Same for faulting an index page, if the whole index doesn't fit in cache.

A B-tree replaces a log2(N) binary search of the whole range into a K-ary search, with log-base-K(N) probes; but adds searches through the K keys per tree node, which all have to be brought into cache.

Even once the K keys have been brought into cache, a binary search in the index page is quite slow on modern hardware because the iterations involve randomly mispredicted branches.

A great advantage of PGM indexes seems to be that the whole index can be held in (L3) cache. Faulting a line from L3 to L1 cache takes only ~30 cycles. Once the index has been walked, you are close to the desired record and can walk forward or back a few steps to locate it.

If you have to handle insertions, it is often better to keep an overflow table searched first (or last) so you can batch-rebuild during downtime. Deletions may be handled by marking dead entries, and updates are easy. Most multigigabyte tables don't see much churn, relative to their size.