Hacker News new | ask | show | jobs
by georgewfraser 2783 days ago
Is the cost based planner relevant to OLTP-style queries, where everything is an indexed lookup, or only to OLAP-style queries that involve scans?
1 comments

Sure, there are plenty of OLTP queries involving joins and complex queries that can have severe performance impacts from bad query plans.
I’m struggling to think of an example of an OLTP query where simple heuristics don’t work but I’m not an expert, can you give an example?
SELECT * FROM a, b WHERE a.x = b.x AND a.y='foo' AND b.z='bar'

The best plan for this simple query depends on the selectivity of the predicates. For example, if 20k rows have a.y='foo', but only 10 rows have b.z='bar', then it's best to scan a b.z index, then lookup matching rows in a. But if the #rows is reversed, then it's better to scan an a.y index and lookup in b. This is a simplified example, but we do see queries along these lines in real OLTP workloads.

You're correct that for many OLTP workload queries, simple heuristics are sufficient. However, even if that's true for 90% of queries, it's the last 10% that gets you. We've seen customers with 10 queries in their workload, where 9 work well but the last 1 gets a bad plan that is 10x slower than it could/should be. Maybe they can rewrite it, or maybe they don't have sufficient knowledge to do so. Or perhaps they're using an ORM and don't have control over the queries it's sending to the DB. In addition, many mostly-OLTP workloads contain a few OLAP reporting queries in the mix. Developers don't expect their OLTP database to perform like a dedicated OLAP data warehouse DB, but they also expect it not to fall over when it gets a more complex query.

That’s a great explanation! Thank you.