Hacker News new | ask | show | jobs
by anarazel 2052 days ago
What kind of in information would you like to see in that trace?

Not planning to work on it directly, but I am planning to do some larger executor work as one of the next bigger projects, and having ideas about what kind of information people would like to see could make it easier to implement them later.

1 comments

A trace should answer why a query is fast or slow. It should provide enough information to a dba (which I’m not) to determine in which domain the most likely optimization is possible. I.e. the query itself, db configuration and tuning or underlying infra.
Unfortunately some of the data for that isn't cheap to collect. Postgres' EXPLAIN (ANALYZE, SETTINGS) <query>, which shows a good bit of what you want, can slow queries down substantially (in some workloads). Primarily because of timestamp overhead.

Just always collecting that information just in case it may get accessed thus isn't really feasible. It'd be good to make it possible to query cheaper information on-demand though (e.g. asking for the EXPLAIN of a query running in another session, without analyze, should be doable with some effort).

You can already set up things in a way that allows to correlate connections / queries with distributed tracing. But it's a more work than it should be. Postgres' pg_stat_activity shows queries, and it can include information that allows to correlate in the connection's 'application_name'.

The point the OP was trying to make I believe, is not that it should do all of that, but that if it doesn't do that a different name to observability should be chosen such as monitoring.
"It would be great if Postgres could emit a trace per query, showing in real-time which internal components were hit by this query. A sort of continuous query explain service."
I think you can achieve this with pg_stat_statements and auto_explain.