Hacker News new | ask | show | jobs
by hendzen 4011 days ago
...Except the number of query plans is O(n!), where n is the number of joins.

MongoDB's query optimizer actually works like this, which is only tractable because it doesn't support joins.

1 comments

I was about to write "Ummm...column order doesn't matter, so it's O(2^n)", but then I did some thinking and a bit of googling, and I guess it does. Yeah, O(n!).

Although, in practice, most tables would only have a relatively small number of columns upon which you'd want to consider building an index (usually on columns (often IDs) that you'd use to join to other tables), so that'd cut it down to O(m!) where m < n, and for larger tables, m << n.

<thinking out loud> Hmmmm, I wonder if you could use frequency information--or even the proportion of distinct values--of columns to guess a bit more intelligently... </thinking out loud>

Perhaps one could even use histograms to estimate selectivity of the query's join predicates to pick a good ordering of joins?

This is actually pretty much how a basic relational query optimizer works in practice.