Hacker News new | ask | show | jobs
by cryptonector 336 days ago
> At the data scale + level of complexity our OLAP queries operate at, we very often run into situations where Postgres's very best plan [with a well-considered schema, with great indexes and statistics, and after tons of tuning and coaxing], still does something literally interminable — not for any semantic reason to do with the query plan, but rather due to how Postgres's architecture executes the query plan[1].

Well, ok, this is a problem, and I have run into it myself. That's not a reason for not wanting a QL. It's a reason for wanting a way to improve the query planning. Query hints in the QL are a bad idea for several reasons. What I would like instead is out-of-band query hints that I can provide along with my query (though obviously only when using APIs rather than `psql`; for `psql` one would have to provide the hints via some \hints commnad) where I would address each table source using names/aliases for the table source / join, and names for subqueries, and so really something like a path through the query and subqueries like `.<sub_query_alias0>.<sub_query_alias1>.<..>.<sub_query_aliasN>.<table_source_alias>` and where the hint would indicate things like what sub-query plan type to use and what index to use.

1 comments

I mean, in my case, I don't think what I want could be implemented via query hints. The types of things I would want to communicate to the server, are pragmas entirely divorced from the semantics of SQL: pragmas that only make sense if you can force the query's plan to take a specific shape to begin with, because you're trying to tune specific knobs on specific plan nodes, so if those plan nodes aren't part of the final query, then your tuning is meaningless.

And if you're pinning the query plan to a specific shape, then there's really no point in sending SQL + hints; you may as well just expose a lower-level "query-execution-engine abstract-machine bytecode" that the user can submit, to be translated in a very low-level — but contractual! — way into a query plan. Or, one step further, into the thing a query plan does, skipping the plan-node-graph abstraction entirely in favor of arbitrarily calling the same primitives the plan nodes call [in a sandboxed way, because such bytecode should be low-level enough that it can encode invalid operation sequences that will crash the PG connection backend — and this is fine, the user signed up for that; they just want to be assured that such a crash won't affect data integrity outside the current transaction.]

Such a bytecode wouldn't have to be used as the literal compiled internal representation of SQL within the server, mind you. (It'd be ideal if it was, but it doesn't need to be.) Just like e.g. the published and versioned JVM bytecode spec isn't 1:1 with the bytecode ISA the JVM actually uses as its in-memory representation for interpretation — there's module-load-time translation/compilation from the stable public format, to the current internal format.

But your mental model of your query is still in a language, even if it's only natural language. Why wouldn't you write a QL and compiler for it that outputs a query plan AST/bytecode/whatever to your liking? The PG SQL compiler and query planner just isn't to your liking, but you really want to be writing queries by hand? I guess what you're saying is you want something like LinkQ that lets you build complex plans/ASTs w/o the complexity of NoSQL hand-coded queries.