| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by 10000truths 993 days ago

What you're saying is that:

  SELECT * FROM table2
  WHERE table1_id in (SELECT ids FROM table1 WHERE ...)

will be optimized to a join by the query planner (which may or may not be true, depending on how confident you are in the stability of the implementation details of your RDBMS's query planner). But in most circumstances, there is no subquery, it's more like:

  SELECT * FROM table2
  WHERE table1_id in (452345, 529872, 120395, ...)

Where the list of IDs are fetched from an API call to some microservice, or from a poorly used/designed ORM library.

1 comments

mr_toad 993 days ago

What’s really bad is where they (or the ORM) generate a seperate select statement for each ID in a loop. Selecting thousands of rows one row at a time from an unindexed table can be hilariously slow.

link

sgarland 992 days ago

And this doesn’t even get into bizarre edge cases, like how Postgres treats foo = ANY(ARRAY[‘bar’, ‘baz’]) differently than foo IN (VALUES((‘bar’), (‘baz’))). PROTIP: the latter is often MUCH faster. I’ve learned this twice this week.

link