Hacker News new | ask | show | jobs
by perrygeo 820 days ago
More philosophically, the query planner is a finely-tuned piece of engineering. If your mental model disagrees with it ("Why isn't it using the index I want?") then it's highly likely your mental model is wrong.

Your data model might contain obvious mistakes. Statistics can be out of date too, if e.g. a table was bulk loaded and never analyzed. `ANALYZE tablename` done. Sometimes removing unused indexes can improve things. TLDR; it's always something that you need to tune in your own database. When in doubt, the query planner is right and you are wrong. Good engineering means having the intellectual curiosity to exhaust these possibility before resorting to hints.

Hints are an extreme measure. You're basically saying that you know better than the query planner, for now until eternity, and you choose to optimize by hand. That may be the case but it requires detailed knowledge of your table and access patterns. The vast majority of misbehaving query plans just need updated statistics or a better index.

6 comments

Queries are complex and the query planner is using heuristics that might just not fit your situation in some cases. The query planner is great for 99.99% of queries but a super small number of edge cases will need tuning at the same time.

Finding out what went wrong in the query plan by looking at optimizer traces is a lot of work. I did so recently and the trace alone was 317MB.

This is not a correct assumption with Postgres. Its statistics collection process has fundamental flaws when tables become large such that the statistics it uses for optimization are in no way representative of the actual data, leading to the situation this thread is about.

Statistics collection is a weak spot in Postgres and query optimization relies on that information to do its job.

> you know better than the query planner, for now until eternity,

Nope, you just have to know it's fixing a real problem today.

Having a query regress in performance below a KPI would be worse than not taking advantage of a further optimization in the future, due to out of date hint.

> Good engineering means having the intellectual curiosity to exhaust these possibility before resorting to hints

Why is that better?

Luckily we don't have to rely on such grandiose claims. Just try it out. If you find a query that you can tune better than the planner for your data set, then it's a better outcome.

> Good engineering means having the intellectual curiosity to exhaust these possibility before resorting to hints.

While it's great that "good engineering" exists, it often requires a bunch more time and effort than people have for the task right then.

Being inflexible and always demanding that time is taken is an extremely poor approach, and often doesn't lead to a good quality result.

Instead it often leads to the inflexible thing being retired and a more flexible alternative being used from then on.

> Hints are an extreme measure. You're basically saying that you know better than the query planner

Absolutely.

A query planner does not analyze the complete and precise solution space, none of them do. The query planner will be extremely wrong sometimes. The only logical solution is to provide a mechanism to guide the planner towards the correct solution.

I've worked with DB2, Oracle and MS SQL Server over the last 3 decades and the story is always the same, the planner is not perfect.

> You're basically saying that you know better than the query planner, for now until eternity, and you choose to optimize by hand.

Yes I do know better what data is going to be in the database I am managing that some heuristic mechanism.