Hacker News new | ask | show | jobs
by rossmohax 1018 days ago
Please use neither. Reason is as number of lookups varies, you'll have to template query as a string adding either values to IN list or `OR` conditions. Not only it causes excessive string concatenation on your app side, these queries are also seen as distinct queries which has following drawbacks:

- every query has to be planned

- your DB driver can't prepare query as they are all different

- collecting per query stats becomes nightmare if number of arguments per query varies in wide range. metrics cardinality is a problem.

Correct way to handle it is pass all args as single parameter of type array and use `= ANY($1)` or if there are multiple columns build a virtual table and join:

  SELECT a,b FROM table 
  NATURAL JOIN ROWS FROM (
    unnest($1::type_of_a[]),
    unnest($2::type_of_b[])
  ) t(a,b)
1 comments

>every query has to be planned

I thought PG planned every query anyway, it does not do plan caching.

It does, but AFAIk only for prepapred queries and cache is local to the backend serving query. See `plan_cache_mode` param.
Which has very limited usefulness since it only last for the session, so it helps if you are say doing the same query over and over on the same connection like bulk inserts but does not help for say the same select use over different requests where the connection is closed or given back to pool on each request which is far more common and would probably line up with the or/in issue here.

I don't even think PG caches plans for functions/sprocs. Back in the day with MSSQL before it had full plan caching based on statement text you could at least make sprocs for common queries and get plan reuse which had a large impact on perf in many apps.

Typically you try and avoid closing a connection on each request, you hand it back to a connection pool. The underlying session stays open and associated with the connection (if you have your pooler set up right), so subsequent requests still use the cache.

I agree with the original commenter about ANY as well: using IN for dynamic lists of parameters makes viewing useful information in e.g pg_stat_statements impossible, though it's possible there's been some recent work around normalizing these.