| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by takeda 2484 days ago

The column can have nulls but after a quick check there's only a single record with a null value (probably added as a test, no idea).

In the bad plan a sequential scan was performed and then it removed over 4 million rows (Rows Removed by Filter: 4562849)

Edit: here is the query planner's output: https://explain.depesz.com/s/eVXI

After rewriting the query (change IN into ANY(VALUES( and adding an index to the "seven" table): https://explain.depesz.com/s/b4kU

1 comments

jtc331 2478 days ago

I think adding the index to the “seven” table is probably the more meaningful change.

You said the index was ignored, but based on that plan output the planner made the right choice since the filter on the “sierra_zulu” table (only one there with an ANY so I assume that’s the one) matches over 90% of the rows. An index scan on that would be far worse.

Adding an index on a different table allowed the driving side of the join to be different, and, with already prefiltered data allowed the other index to be useful as well (fewer index lookups). I’d bet with that index addition your IN clause works just fine.

takeda 2478 days ago

If you look at sierra_zulu (line #3 on both plans) it took almost 22 seconds initially and after query rewrite (and no index change) took 713ms.

So that change of query alone reduced query time from 35s to ~5s, after that the seven table become the bottleneck and adding an index there reduced the whole query down to 750ms.

I should have saved the intermediate plan to illustrate it, but it essentially looked same as the second plan, except there was a sequence scan on seven table.