Hacker News new | ask | show | jobs
by btilly 3822 days ago
You just use the approach used in simulated annealing. You set a threshold for making random decisions, and adjust that probability up and down as you try to optimize. For example you might say that in each step there is a 10% chance of adjusting the COST factor randomly by a factor between 0.2 to 5. If you've got 3 tables, most of the time you will come to the current optimal decision. Most of the rest of the time you will make one suboptimal choice. Run that a few times and you'll should several plans that are suboptimal but not actually horrible. Keep running and playing with the randomness factors until you either have enough plans, or are basically making entirely random choices and can't get enough.

As to the algorithm described in the comment, it is actually able to search through all plans. The dynamic programming bit just means that you don't have to unnecessarily traverse all possible logic paths that a naive recursive algorithm might. But if there is a good plan, it can be discovered.

It is unclear from the comment whether the algorithm can only find plans which involve adding one table at a time. This would not be unreasonable - for example I know that Oracle circa a decade ago did that to avoid having to consider too many query plans on large joins. I have no idea whether that ever got changed.

As to the parse tree, that seems like overkill to me. You have a query. It has a list of tables, and a list of join conditions. A query plan includes tables, indexes, filters, and so on. It could easily be augmented by stating which query condition was applied where.

To validate it you have to see that the lists of tables match, the list of applied conditions match, and each plan step could actually have the effect of doing that condition. If it all ties out, it is a valid plan. The fact that the plan might be crap is something that is explicitly left in the hands of the user - that's the whole point of the feature. But at that point it is clearly valid.