Hacker News new | ask | show | jobs
by samtho 1029 days ago
The results seem kind of obvious. The OR operator has the lowest precedence in SQL, requires a full statement after as well as continually widens the “view” that the query will ultimately see as a result of the union between all matching result sets. It does not imply the same things as an equivalent OR operator in a procedural language. The IN operator takes a single field and internally matched to all rows that match the given list of values.

Granted, SQL is different from typical procedural/OO/functional languages in the implication that the engine will create the “correct” internal representation of the query. I’m sure DBMSes can heuristically identify the case where somewhere just chains ORs whos only statement is comparing equal a literal value to the same field more than once, and collapse it to an IN. But why? If IN exists, why should the engine care about optimizing OR chains?

1 comments

I was actually surprised that it mattered. I had expected the query optimizer to figure out the logical equivalent between the two and then pick the most optimal one for me.
Same here. I would have expected the OR an IN versions to generate the same query plan, it's seems like low hanging fruit for the query optimizer.