by btilly
3822 days ago
It may not look like a sexy marketing feature. But it is the #1 reason why I don't recommend PostgreSQL. I also don't think that query hints are a good way to do it. And I don't mind if the way to do it is somewhat cumbersome. This is very much a case where 20% of the work can give 99% of the benefit.
For example, what about the following approach?
1. Add an option to EXPLAIN that will cause PostgreSQL's optimizer to spit out multiple plans it considered, with costs, and with a description of each plan that PostgreSQL can easily parse and fit to a query.
2. Add a PLAN command that can be applied to a prepared statement and will set its plan. It is an error to submit a plan that does not match the query.
And now in the rare case where I don't like a query's plan I can:
EXPLAIN PLANS=3 (query);
Pick my desired plan from the list (hopefully).
Then in my code I:
PREPARE foo AS (query);
PLAN foo (selected plan);
EXECUTE foo;
And now if I notice that a query performs worse than I think it should, I can make it do what I want it to.
|
The biggest problem with that approach is that query planning doesn't work by fully building 100 different plans, evaluating the cost of each, and then comparing them. Instead it's more like a dynamic programming approach where you iteratively build pieces of a query plan from the ground up, and then combine those pieces to build the next layer up. Given the space of possible query plans, especially with several relations and indexes on each relation involved, such an approach is required to actually ever finish planning.
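To make the "build pieces, then combine them" idea concrete, here is a minimal sketch of textbook dynamic programming over join orders. It is not PostgreSQL's planner: the function name `best_join_order`, the toy cost model (sum of intermediate row counts), and the restriction to left-deep plans are all assumptions made for illustration.

```python
from itertools import combinations


def best_join_order(cards, selectivity):
    """Toy left-deep join ordering by dynamic programming (hypothetical).

    cards:       dict mapping relation name -> estimated row count
    selectivity: dict mapping frozenset({a, b}) -> join selectivity
                 (pairs that are missing are treated as a cross product, 1.0)
    Returns (cost, rows, order) for the cheapest plan found.
    """
    rels = sorted(cards)
    # best[subset] = (cost so far, estimated rows, join order) for that subset
    best = {frozenset([r]): (0.0, float(cards[r]), (r,)) for r in rels}

    for size in range(2, len(rels) + 1):
        for combo in combinations(rels, size):
            subset = frozenset(combo)
            for r in combo:
                # Reuse the best plan already found for the smaller piece.
                rest = subset - {r}
                cost, rows, order = best[rest]
                sel = 1.0
                for other in rest:
                    sel *= selectivity.get(frozenset((r, other)), 1.0)
                out_rows = rows * cards[r] * sel
                candidate = (cost + out_rows, out_rows, order + (r,))
                # Keep only the cheapest plan per subset; the rest are discarded.
                if subset not in best or candidate[0] < best[subset][0]:
                    best[subset] = candidate

    return best[frozenset(rels)]
```

Only one plan per subset of relations survives each step, so the losing alternatives never exist as complete plans that could be listed afterwards. For ten relations this table holds 2^10 - 1 = 1,023 entries, while there are 10! = 3,628,800 left-deep join orders.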
> Add a PLAN command that can be applied to a prepared statement and will set its plan. It is an error to submit a plan that does not match the query.
Proving that a specific plan matches a query is not easy, and in the generic case it may not be possible at all. You could obviously try to build every possible plan and match against each of those, but that's computationally infeasible (we're talking about a number of plans that grows factorially with the number of relations here).
So I think such an approach has no chance of working.
What's more realistic is running queries in a "training" mode. That training mode would, matching on the specific parsetree, store the resulting plans in a table. Before exiting training mode you'd mark all these plans approved (after looking for bad cases, obviously). After that, preparing a new query still does the original query planning, but by default the plan stored in the "approved plans" table would be used. The cost differential and the new plan would then be associated with the currently approved plan. The DBA (or whoever fulfills that role) regularly checks the potential plans and approves new ones.
Based on a configuration option, queries without approved plans would error out, raise a log message, or just work.
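The bookkeeping for such a training mode could look roughly like the sketch below. Everything in it is hypothetical: `PlanStore`, `ApprovedPlan`, and the `on_missing` option are invented names, plans are plain strings, and whitespace/case-normalized query text stands in for matching on the parsetree.

```python
from dataclasses import dataclass, field


@dataclass
class ApprovedPlan:
    plan: str
    cost: float
    # (new plan, new cost, cost differential) tuples awaiting DBA review
    candidates: list = field(default_factory=list)


class UnapprovedQueryError(Exception):
    pass


class PlanStore:
    """Hypothetical 'approved plans' table; not an existing PostgreSQL feature."""

    def __init__(self, planner, on_missing="allow"):
        if on_missing not in ("error", "log", "allow"):
            raise ValueError("on_missing must be 'error', 'log' or 'allow'")
        self.planner = planner      # callable: query key -> (plan, cost)
        self.on_missing = on_missing
        self.training = False
        self.pending = {}           # plans recorded while training
        self.approved = {}          # key -> ApprovedPlan
        self.log = []

    def approve_all(self):
        """Mark every plan seen during training as approved and leave training."""
        for key, (plan, cost) in self.pending.items():
            self.approved[key] = ApprovedPlan(plan, cost)
        self.pending.clear()
        self.training = False

    def prepare(self, query):
        # Crude stand-in for matching on the parsetree.
        key = " ".join(query.lower().split())
        plan, cost = self.planner(key)  # normal planning always runs
        if self.training:
            self.pending[key] = (plan, cost)
            return plan
        entry = self.approved.get(key)
        if entry is None:
            if self.on_missing == "error":
                raise UnapprovedQueryError(key)
            if self.on_missing == "log":
                self.log.append(f"no approved plan for: {key}")
            return plan
        if plan != entry.plan:
            # Remember the new plan and its cost differential for later review.
            entry.candidates.append((plan, cost, cost - entry.cost))
        return entry.plan           # the approved plan wins by default
```

The important design choice is that `prepare` keeps planning normally but returns the approved plan, so a plan change shows up as a candidate for review instead of as a surprise in production.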
Now even that has significant problems because e.g. DDL will have the tendency to "invalidate" all the approved plans. But that's manageable in comparison to being woken up Friday night.