|
|
|
|
|
by ibdknox
3219 days ago
|
|
Is there anything written about your multi-way join algorithm? We've done some really interesting implementations of Generic Join and the like and are always interested in hearing about other potentially better join algorithms. |
|
The case for triangle joins is simple to describe, though. Suppose we have attributes A, B and C and finite maps (relations) R : A x B -> X, S : A x C -> Y and T : B x C -> Z. Now build tries for each so we have
abx : Trie A (Trie B X)
acy : Trie A (Trie C Y)
bcz : Trie B (Trie C Z)
We can join two tries using join : Trie k v1 -> Trie k v2 -> (v1 -> v2 -> v) -> Trie k v. Putting this all together we have:
triangleJoin : Trie A (Trie B (Trie C (X, Y, Z)))
triangleJoin = join abx acy (\bx cy -> join bx bcz (\x cz -> join cy cz (\y z -> (x, y, z))))
Compared to a database that doesn't know how to handle triangle joins the difference is absolutely stunning. The above one-liner managed to do in 5 seconds what took MySQL over an hour...