Hacker News new | ask | show | jobs
by topspin 3411 days ago
I'm replying due to this same part/paragraph. I've been dealing with some manifestation of EXPLAIN since Oracle 6. In the last few years I've become handy with PostgreSQL's EXPLAIN as well, and this reminds me of my biggest hang-up regarding PostgreSQL; their hostility toward optimizer hints.

Like Mr. Stein I too have found myself in bad places with PostgreSQL's optimizer. This is commonplace with relational systems; every such system I've ever dealt with, including all versions of Oracle since the mid 90's, Informix, MS-SQL, DB/2 (on AS/400, Windows and Linux,) and PostgreSQL eventually get handed a query and a schema that produces the wrong plan and has intolerably bad performance. No exception. None of these attempts to create flawless optimizers that anticipate every use case has ever succeeded, PostgreSQL included.

With other systems there are hints that, as a last resort, you can apply to get efficient results. Not so much with PostgreSQL. Not implementing the sort of hints that solve these problems (as opposed to the often ineffectual enable_* planner configuration, unacceptable global configuration and other workarounds needed with PostgreSQL) is policy:

"We are not interested in implementing hints in the exact ways they are commonly implemented on other databases. Proposals based on 'because they've got them' will not be welcomed."

How about proposals based on "because your hint-free optimizer gets it wrong and I require a working solution without too many backflips and somersaults or database design lectures." No? Then sorry; I can't risk getting painted into a corner by your narrow minded and naive policy. PostgreSQL goes no further than non-critical, ancillary systems when I have say in it. And I do.

2 comments

You're absolutely correct. Optimizer hints are a source of contention in Postgres. I'm sure you're already aware, but just for completeness, here's some more information on optimizer hints and Postgres from Simon Riggs back in 2011. I don't expect this to sway opinion one way or the other. It's useful to read more on Postgres' policy.

http://blog.2ndquadrant.com/hinting_at_postgresql/

From the introduction:

- Introducing hints is a common source of later problems, because fixing a query place once in a special case isn’t a very robust approach. As your data set grows, and possibly changes distribution as well, the idea you hinted toward when it was small can become an increasingly bad idea.

- Adding a useful hint interface would complicate the optimizer code, which is difficult enough to maintain as it is. Part of the reason PostgreSQL works as well as it does running queries is because feel-good code (“we can check off hinting on our vendor comparison feature list!”) that doesn’t actually pay for itself, in terms of making the database better enough to justify its continued maintenance, is rejected by policy. If it doesn’t work, it won’t get added. And when evaluated objectively, hints are on average a problem rather than a solution.

- The sort of problems that hints work can be optimizer bugs. The PostgreSQL community responds to true bugs in the optimizer faster than anyone else in the industry. Ask around and you don’t have to meet many PostgreSQL users before finding one who has reported a bug and watched it get fixed by the next day.

Now, the main completely valid response to finding out hints are missing, normally from DBAs who are used to them, is “well how do I handle an optimizer bug when I do run into it?” Like all tech work nowadays, there’s usually huge pressure to get the quickest possible fix when a bad query problem pops up....

Having never used postgres, this seems extremely scary. I've only ever needed index hints a couple times in other databases, but when you need them there isn't usually an alternative. Messing with global knobs is a good way to cause more problems than you are solving.

I've even entertained the idea that every query should be hinted. For OLTP workloads, you practically always know exactly how you want the DB to execute your query anyways. And often times you find out very late that the query planner made the wrong choice and now your query is taking orders of magnitude longer than it should (worse, sometimes this changes at runtime). I've never actually gone through with this religiously though...

>> Having never used postgres, this seems extremely scary. I've only ever needed index hints a couple times in other databases, but when you need them there isn't usually an alternative. Messing with global knobs is a good way to cause more problems than you are solving.

You've got the plot exactly. The last such battle I was involved with ended in creating a materialized view to substitute for several tables in a larger join; without the view there was no way[1] to get an acceptable plan. Creating this view was effectively just a form of programming our own planner. And yes, the need to update the view to get the desired result is an ongoing problem; one that's scheduled to get solved with a migration to another DB.

Like you I've never been all that quick to employ hints. I tend to use them while experimenting during development or troubleshooting and avoid them in production code. But there have been production uses, and you know what? The world did not end. No one laughed at or fired me. No regulatory agency fined me. It did not get posted on Daily WTF. No subsequent maintenance programmer has ever shown up at my home in the dead of night. It just solved the problem, quickly and effectively.

Sure would be nice if people purporting to offer a fit-for-purpose relational systems understood the value of a little pragmatism.

[1] given the finite amount of time we could sacrifice to deal with it

Just to be clear, those settings are global per query, not global for the entire server. Still that makes them almost useless for large queries, but I would not exactly call them scary.