|
|
|
|
|
by charukiewicz
1913 days ago
|
|
Author here. Between this and your other response, where you expound on the same point, I think you're being far too hand wavy about what causes performance issues. The number of joins alone doesn't have much to do with the performance characteristics of any query. What's more important for performance of queries on larger data sets than the number of joins is that there are indexes and that the query is written in a way that can utilize indexes to avoid multiplicative slowdowns. The reason the UNION query is fast is because the query on either side effectively utilizes the indexes so that the database engine can limit the number of rows immediately, rather than filter them out after multiplying them all together. I can expand this schema to have a UNION query with two 10-table joins and it would still perform better than the 7 table query. I think someone new to SQL is likely to read your statement and think "okay joins are slow so I guess I should avoid joins". This is not true and this belief that joins are slow leads people down the path that ends at "SQL just doesn't scale" and "let's build a complicated Redis-backed caching layer". SQL performance is a complex topic. The point of our post was to illustrate that a UNION query can simplify how your join your tables and allow you to write constituent queries that have better performance characteristics. Morphing this into "the number of joins is smaller so the performance is better" is just incorrect. |
|
If the article had, instead, listed indexes, shown they were used in simple cases, shown they weren't used in the second query, dug into why they weren't (maybe they were but it was still hella slow) - that would be a ton of value!