|
|
|
|
|
by munk-a
1844 days ago
|
|
The article touched on some caveats but missed what I think is a big one - you really want to capture any detailed explains from environments as close to production as possible. Different table statistics can cause the planner to go in wildly different directions and while faster is always better it is very easy to accidentally get caught up trying to sink a lot of effort into making a query more performant that was running slow due to thrashing in ram on a dev-box. Explain (analyze at least - which you should always use) is a lot less theoretical than you might assume. That can make it a bit more onerous to execute but it ends up adding a lot of value to the statistics when you gather them. Oh also - query caching on postgres is a thing so if you're worried about performance from a cold state don't forget to clear caches before executing. And if anyone has any good suggestions around tools to screw up table statistics I haven't found a good one that I like yet. |
|
Exactly. That's why my team and I (Postgres.ai) have developed Database Lab Engine [1] and a chatops tool for SQL optimization, Joe bot [2], both are open-source (AGPLv3).
EXPLAIN (ANALYZE, BUFFERS) has to be executed on the same-size DB, with properly adjusted Postgres configuration.
Interesting, that the machine you might using for query plan troubleshooting, can have less RAM and different hardware in general – it doesn't matter for the planner. Even shared_buffers doesn't matter – you can set effective_cache_size matching production (this trick we use in Database Lab when hardware is weaker than on production).
As for the cache states – very good point as well. I'm advocating for buffers- or rows-centric approach: first, optimization should be done to reduce the numbers of buffers or, if you're working with "logical" (dump/restore) copy of the database rather than "physical" (PGDATA copy, keeping the same data layout, including bloat, etc.) – the fewer the numbers, the better. Only then, you pay attention to timing – and keep in mind what can happen under the worst conditions (everything is read from disk), if it makes sense.
[1] https://postgres.ai/products/how-it-works
[2] https://postgres.ai/products/joe