|
|
|
|
|
by sgarland
4 days ago
|
|
> using joins just isn’t feasible That sounds like you’re either joining on an unindexed column, or have outdated statistics. 8 queries implies 8 tables, which is well under the limit for both Postgres and MySQL where the planner may give up and choose a suboptimal plan. You may have something like “SELECT * FROM foo JOIN bar ON foo.id = bar.foo_id WHERE bar.baz = 'qux'”, and if there’s no composite index on (bar.foo_id, bar.baz), the planner will choose whichever column it thinks is more selective; it then has to go get the value for the other one, and that can be quite expensive at scale. Even if you have a separate index on each of those, there’s no guarantee the planner will decide to merge them. > Yeah but the increased I/O is cheaper. It's easier to add another webserver as opposed to upgrading your db server. I’m referring to I/O on the DB. Rows are stored in pages that are generally 8 KiB (Postgres and MS SQL Server default) or 16 KiB (InnoDB default). If you can fit 200 rows per page, a given query will probably have to fetch fewer pages than if you can only fit 100 rows per page. |
|
I'm not sure thats true for Postgres. optimal join ordering is np-hard, and finding an optimal join requires exhaustive search through n! combinations (n=number of joins) - thats why postgres generally uses heuristics to figure out join order. 8 is also the default value of "join_collapse_limit" setting in postgres, so it can't ever reliably optimize over 8 joins at a time. Additionally, postgres starts using "genetic algorithms" aka testing random combinations of joins with 12 joins by default (geqo_threshold setting).
I generally agree its better to use database to its fullest, but I would say 8 joins is probably the "limit". Internally at work I've advised teams to try to avoid anything more than 6 joins for "hot-path" queries.