Hacker News new | ask | show | jobs
by jkahn 4987 days ago
Thanks for the article. What do you use to measure your PostgreSQL performance? Do you use any visual dashboards to track down bottlenecks, etc?
2 comments

If you are lucky enough to be able to use Postgres 9.2, consider running pg_stat_statements all the time. It is one of my favorite pieces of work: it canonicalizes queries and can tell you a number of useful statistics on it.

One of the advantages is that it can help you identify queries that are both very short but frequent (and could use a cache, or whatnot), and would fly below the radar of most log aggregations, because most people turn off logging off for very short statements. See also: identifying n+1 query pathologies.

It really blows away log analysis and pg_fouine for all but the most heavyweight tasks, and its approach is a lot more sound than log analysis tools because it uses the semantic representation -- not the syntactic one -- of the query submitted.

If you need even more fine-grained statistics, check out https://github.com/johto/call_graph

It allows you to get statistics per top-level function, which can reveal abnormalities otherwise impossible to detect, such as a function which is usually very fast, but very slow for a small number of calls relative to the total number of calls. Consider fa()->fx() and fb()->fx(), if fa()->fx() always is fast and you have a million such calls, but for some reason fb()->fx() is very slow, but you only have a few such calls, fx() will look like a very fast function without any performance problems, if you only look at pg_catalog.pg_stat_user_functions, but call_graph will reveal fx() being slow when called by the top-level function fb().

This is only relevant for systems always accessing the database through Stored Procedures though.

One of the best metrics is the log of slow queries. We set our threshold at 250 milliseconds, and we investigate queries that take longer. Sometimes, we'll see INSERT or COMMIT statements show up, which is usually an indication that our write performance isn't what it needs to be. We generate graphs from these logs (using graphite) and track them over time. We also use Munin (including the PostgreSQL plugins) to monitor disk IO and various PostgreSQL stats.