| HN Mirror



	by stvltvs 993 days ago
	Not if the dev has removed the context of which table the id values come from.

1 comments

muzaffarpur 993 days ago

Can you explain a bit further? If the subquery does not have context about where the value comes from, I would think the query will error out?


josephg 993 days ago

Lots of application developers don’t really understand how SQL works. So instead of learning to do a join (or a subquery or anything like that) they just issue multiple queries from their application backend.

The posters up thread are talking about novice Python / JavaScript / whatever code which issues queries in sequence: first a query to get the IDs. Then once that query comes back to the application, the IDs are passed back to the database for a single query of a different table.

The query optimizer can’t help here because it doesn’t know what the IDs are used for. It just sees a standalone query for a bunch of IDs. Then later, an unrelated query which happens to use those same IDs to query a different table.
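To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module. The schema, the `active` column, and the function names are made up for illustration; only `table1`, `table2`, and `table1_id` come from this thread. The first function is the two-round-trip pattern described above, the second hands the whole problem to the database as one join.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory database (hypothetical schema, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE table1 (id INTEGER PRIMARY KEY, active INTEGER NOT NULL);
        CREATE TABLE table2 (
            id INTEGER PRIMARY KEY,
            table1_id INTEGER NOT NULL,
            payload TEXT
        );
        INSERT INTO table1 VALUES (1, 1), (2, 0), (3, 1);
        INSERT INTO table2 VALUES (10, 1, 'a'), (11, 2, 'b'), (12, 3, 'c'), (13, 3, 'd');
        """
    )
    return conn


def two_round_trips(conn):
    """Application-side pattern: fetch the IDs, then send them back as literals."""
    ids = [row[0] for row in conn.execute("SELECT id FROM table1 WHERE active = 1")]
    if not ids:
        return []
    # The database only sees a list of numbers here; it has no idea
    # they came from table1, so it cannot plan the two steps together.
    placeholders = ", ".join("?" for _ in ids)
    sql = (
        "SELECT id, payload FROM table2 "
        f"WHERE table1_id IN ({placeholders}) ORDER BY id"
    )
    return conn.execute(sql, ids).fetchall()


def single_join(conn):
    """One statement: the planner sees both tables and the relationship between them."""
    sql = """
        SELECT t2.id, t2.payload
        FROM table2 AS t2
        JOIN table1 AS t1 ON t1.id = t2.table1_id
        WHERE t1.active = 1
        ORDER BY t2.id
    """
    return conn.execute(sql).fetchall()
```

Both functions return the same rows. The difference is that only the second gives the planner enough information to choose a join strategy, and it costs one round trip instead of two.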


10000truths 993 days ago

What you're saying is that:

  SELECT * FROM table2
  WHERE table1_id in (SELECT ids FROM table1 WHERE ...)

will be optimized to a join by the query planner (which may or may not be true, depending on how confident you are in the stability of the implementation details of your RDBMS's query planner). But in most circumstances, there is no subquery, it's more like:

  SELECT * FROM table2
  WHERE table1_id in (452345, 529872, 120395, ...)

where the list of IDs is fetched from an API call to some microservice, or from a poorly used/designed ORM library.


mr_toad 993 days ago

What’s really bad is where they (or the ORM) generate a separate select statement for each ID in a loop. Selecting thousands of rows one row at a time from an unindexed table can be hilariously slow.
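A rough sketch of that loop pattern next to a batched alternative, again with Python's sqlite3. The `CountingConnection` wrapper, the schema, and the function names are hypothetical and exist only to count how many statements get issued. Note that `table1_id` is deliberately left unindexed, so every per-ID query has to scan the table.

```python
import sqlite3


class CountingConnection:
    """Hypothetical thin wrapper that counts the statements sent through it."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def execute(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params)


def make_db(n_rows=1000):
    """In-memory table with no index on table1_id (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table2 (id INTEGER PRIMARY KEY, table1_id INTEGER, payload TEXT)"
    )
    conn.executemany(
        "INSERT INTO table2 (table1_id, payload) VALUES (?, ?)",
        [(i, f"row-{i}") for i in range(n_rows)],
    )
    return CountingConnection(conn)


def fetch_one_at_a_time(db, ids):
    """The loop pattern: one SELECT (and one table scan) per ID."""
    rows = []
    for id_ in ids:
        cursor = db.execute(
            "SELECT table1_id, payload FROM table2 WHERE table1_id = ?", (id_,)
        )
        rows.extend(cursor.fetchall())
    return rows


def fetch_batched(db, ids):
    """One SELECT for the whole list. Keep the list modest: databases cap
    the number of bound parameters per statement."""
    ids = list(ids)
    if not ids:
        return []
    placeholders = ", ".join("?" for _ in ids)
    sql = (
        "SELECT table1_id, payload FROM table2 "
        f"WHERE table1_id IN ({placeholders}) ORDER BY table1_id"
    )
    return db.execute(sql, ids).fetchall()
```

With 100 IDs, the first function issues 100 statements and the second issues one. Against a real networked database each of those statements also pays a round trip, which is usually where most of the time goes.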


sgarland 992 days ago

And this doesn’t even get into bizarre edge cases, like how Postgres treats foo = ANY(ARRAY['bar', 'baz']) differently than foo IN (VALUES ('bar'), ('baz')). PROTIP: the latter is often MUCH faster. I’ve learned this twice this week.
