Take 10 consecutive prime numbers:
2, 3, 5, 7, 11, 13, 17, 19, 23, 29
And they can distinguish all uint32 numbers:
2*3*5*7*11*13*17*19*23*29 > ^uint32(0)
Huh? How do I represent 31 (or any prime greater than 29)?
You don't represent primes, the primes are used to split the keys up and decide where in the tree it goes.
e.g. to represent 31, you take 31%2=1, so you use child 1 of the root node. If the root node already has child 1, you take 31%3=1 and use child 1 of that node. And so on.
Oh, I see. It's like a B-tree with a larger node size at each level, yes? But then what's the advantage of basing the node size on primes? Wouldn't you get the same advantages with simpler code by using successive powers of two at each level instead of successive primes?
The idea is that you will find a place in the tree for your given number at a relatively shallow depth, because every number in the ring Z/p1×p2×..pn×Z has a unique list of remainders in the separate rings Z/pnZ. As long as the product of the primes is larger than the size of your key space, then you won’t get any collisions.
I’m not sure using consecutive primes starting from 2 is actually the best choice though. You could use any primes you like (or for that matter any collection of factors which are all coprime), so long as their product is more than the largest possible key. The choice of factors should probably be tweaked based on benchmarks in real-world use. For instance, the factors could be 25=5², 29, 31, 32=2⁵, 33=3×11, 37.
[Excuse the lack of subscripts in the notation here; no universally installed fonts contain a subscript n, as far as I know. Likewise I’d prefer to use ℤ rather than Z, but I’m afraid it might show up as little boxes for some readers.]
When you use primes for your modulo, you get unique representation. This is a result of the Chinese remainder theorem. If my key is k, then calculating
k mod 2
k mod 3
k mod 5
...
k mod 29
uniquely determines k mod 2* 3* ...* 29. This way there are no collisions. Any set of pairwise relatively prime moduli would work, but using using primes is simple and it's easy to extend (just add the next prime to the list moduli).
This would then dictate the location in the hash tree (depending on how deep it is) of any entry to hashes to 31 or to those values modulo those primes.
e.g. to represent 31, you take 31%2=1, so you use child 1 of the root node. If the root node already has child 1, you take 31%3=1 and use child 1 of that node. And so on.