Hacker News new | ask | show | jobs
by lisper 3741 days ago
Oh, I see. It's like a B-tree with a larger node size at each level, yes? But then what's the advantage of basing the node size on primes? Wouldn't you get the same advantages with simpler code by using successive powers of two at each level instead of successive primes?
2 comments

This is relying on the https://en.wikipedia.org/wiki/Chinese_remainder_theorem

The idea is that you will find a place in the tree for your given number at a relatively shallow depth, because every number in the ring Z/p1×p2×..pn×Z has a unique list of remainders in the separate rings Z/pnZ. As long as the product of the primes is larger than the size of your key space, then you won’t get any collisions.

I’m not sure using consecutive primes starting from 2 is actually the best choice though. You could use any primes you like (or for that matter any collection of factors which are all coprime), so long as their product is more than the largest possible key. The choice of factors should probably be tweaked based on benchmarks in real-world use. For instance, the factors could be 25=5², 29, 31, 32=2⁵, 33=3×11, 37.

[Excuse the lack of subscripts in the notation here; no universally installed fonts contain a subscript n, as far as I know. Likewise I’d prefer to use ℤ rather than Z, but I’m afraid it might show up as little boxes for some readers.]

When you use primes for your modulo, you get unique representation. This is a result of the Chinese remainder theorem. If my key is k, then calculating

    k mod 2
    k mod 3
    k mod 5
    ...
    k mod 29
uniquely determines k mod 2* 3* ...* 29. This way there are no collisions. Any set of pairwise relatively prime moduli would work, but using using primes is simple and it's easy to extend (just add the next prime to the list moduli).