by mihailstoian
138 days ago
If you're referring to estimating join sizes, i.e., the stuff you have to estimate before you actually build the query plan, we're _almost_ there (but not yet). Do check out the following papers, which show that you can obtain provable bounds on your join sizes. Basically, given a SQL query, they'll tell you the maximum and minimum number of tuples, respectively, that the query can return.

1. LpBound [1]: join size upper bounds. It still doesn't have full SQL coverage (e.g., string predicates, window functions, subqueries), but as with all cool stuff, it takes time to build.

2. xBound [2]: join size lower bounds. We showed how to do it at least for multi-way joins on the same join key; many subexpressions of the JOB benchmark have this shape. How to do the rest is still open. I'd say it's even harder than for upper bounds!

(NB: I'm an author.)

[1] LpBound: https://arxiv.org/abs/2502.05912

[2] xBound: https://arxiv.org/abs/2601.13117
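To make "provable upper bound" a bit more concrete, here's a toy sketch in Python for a single two-way equi-join. To be clear, this is not the LpBound or xBound algorithm; it only applies two textbook inequalities (a max-degree bound and Cauchy-Schwarz) to per-key frequency counts. The function names and the sample relations `R` and `S` are made up for illustration.

```python
from collections import Counter
from math import sqrt


def degree_counts(rows, key_index):
    """Count how often each join-key value occurs in a relation."""
    return Counter(row[key_index] for row in rows)


def true_join_size(deg_r, deg_s):
    """Exact size of the equi-join: sum over keys of deg_R(v) * deg_S(v)."""
    return sum(d * deg_s[v] for v, d in deg_r.items())


def max_degree_bound(deg_r, deg_s):
    """Each tuple of one side matches at most max-degree tuples of the other."""
    size_r = sum(deg_r.values())
    size_s = sum(deg_s.values())
    max_r = max(deg_r.values(), default=0)
    max_s = max(deg_s.values(), default=0)
    return min(size_r * max_s, size_s * max_r)


def l2_bound(deg_r, deg_s):
    """Cauchy-Schwarz: sum(a_v * b_v) <= ||a||_2 * ||b||_2."""
    norm_r = sqrt(sum(d * d for d in deg_r.values()))
    norm_s = sqrt(sum(d * d for d in deg_s.values()))
    return norm_r * norm_s


# Hypothetical toy relations; the join key is column 0.
R = [(1, "a"), (1, "b"), (2, "c"), (3, "d")]
S = [(1, "x"), (1, "y"), (1, "z"), (2, "w")]

deg_r = degree_counts(R, 0)
deg_s = degree_counts(S, 0)

print("true size:       ", true_join_size(deg_r, deg_s))
print("max-degree bound:", max_degree_bound(deg_r, deg_s))
print("l2 bound:        ", l2_bound(deg_r, deg_s))
```

On this toy input the true join size is 7, the max-degree bound is 8, and the Cauchy-Schwarz bound is sqrt(60), roughly 7.75. Both bounds are guaranteed to hold for any input, which is what makes them "provable", unlike a sampled or learned estimate that can land on either side of the truth.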