| HN Mirror



	by Nextgrid 1161 days ago
	Out of curiosity, if the problem of connections being expensive is solvable by PGBouncer-style connection multiplexing, why doesn't Postgres just do that by itself?

5 comments

anarazel 1161 days ago

If it were easy to do well in-core, I think we'd do it immediately. Unfortunately the interesting pooling modes in pgbouncer also break a few things - which would likely not be acceptable in core postgres. Avoiding such breakage requires non-trivial architectural adjustments.

Historically connection state and "process state" have been tightly coupled; for good server-side pooling they have to be divorced. While good pooling is doable with the current process model (passing the client file descriptor between processes using SCM_RIGHTS), it's much harder with processes than with threads - this is one of the reasons I think we will eventually need to migrate to threads.
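For readers unfamiliar with SCM_RIGHTS, here is a minimal Python sketch of descriptor passing over a Unix-domain socket. It is an illustration only, not how Postgres does or would do it: the function name `pass_fd_demo` is hypothetical, a pipe stands in for the client connection, and both ends live in one process for brevity (the same calls work between separate processes). It assumes Python 3.9+ on a Unix-like system, where `socket.send_fds` and `socket.recv_fds` are available.

```python
import os
import socket


def pass_fd_demo() -> bytes:
    """Hand a descriptor across a Unix socket and read through the received copy."""
    # The socket pair stands in for the channel between two backend processes.
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    # The pipe stands in for a client connection being handed over.
    read_end, write_end = os.pipe()
    try:
        # send_fds attaches the descriptor as SCM_RIGHTS ancillary data.
        socket.send_fds(sender, [b"client"], [read_end])
        _msg, fds, _flags, _addr = socket.recv_fds(receiver, 1024, 1)
        received_fd = fds[0]
        # The received descriptor has a new number but refers to the same open pipe.
        os.write(write_end, b"hello")
        data = os.read(received_fd, 5)
        os.close(received_fd)
        return data
    finally:
        os.close(read_end)
        os.close(write_end)
        sender.close()
        receiver.close()
```

The receiving side can read from the connection even though it never opened it, which is the property a process-based pooler would rely on.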

Eventually I want to get to a point where we have a limited number of "query execution workers" that handle query execution, utilized by a much larger number of client connections (which do not have dedicated threads each). Obviously there's a long way to go to get there. Ah, the fun of working on a complicated application with a ~35 year history.
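The shape of that idea, a small fixed set of workers serving many more clients, can be sketched with a plain work queue. This is a toy model under stated assumptions, not Postgres code: `run_pool`, the fake "queries", and the use of Python threads are all hypothetical stand-ins.

```python
import queue
import threading


def run_pool(num_workers: int, num_clients: int):
    """Serve num_clients requests using only num_workers worker threads."""
    work = queue.Queue()
    results = {}
    used_workers = set()
    lock = threading.Lock()

    def worker(worker_id: int) -> None:
        while True:
            item = work.get()
            if item is None:  # sentinel: no more work for this worker
                break
            client_id, query = item
            with lock:
                results[client_id] = f"result of {query}"
                used_workers.add(worker_id)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    # Each client submits one request; no client owns a worker.
    for client_id in range(num_clients):
        work.put((client_id, f"SELECT {client_id}"))
    # One sentinel per worker so every thread exits.
    for _ in threads:
        work.put(None)
    for t in threads:
        t.join()
    return results, used_workers
```

Calling `run_pool(4, 100)` answers all 100 clients while never running more than 4 workers. The hard part described above, keeping per-connection state separate from the worker that happens to run a query, is exactly what this toy leaves out.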

There also are use cases for pgbouncer that cannot be addressed on the server-side - one important one is to run pgbouncer on "application servers", to reduce the TCP+TLS connection establishment overhead and to share connections between application processes / threads. That can yield very substantial performance gains - completely independent of server side pooling support.


dingledork69 1161 days ago

Because pgbouncer's transaction-based pooling, which is what the previous poster was referring to, breaks a few postgres features. This is fine for most applications, but not all. See the table on https://www.pgbouncer.org/features.html
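To make the breakage concrete, here is a toy simulation of why session-level state does not survive transaction pooling. Everything in it is hypothetical (`ServerSession`, `TransactionPool`, the tuple-based "statements"), and the round-robin assignment is an assumption chosen to keep the example deterministic; a real pooler hands out whichever server connection is free.

```python
class ServerSession:
    """Stand-in for one server connection holding session-level settings."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.settings = {}


class TransactionPool:
    """Toy pooler: each transaction may land on a different server session."""

    def __init__(self, sessions) -> None:
        self.sessions = list(sessions)
        self.next_index = 0

    def run_transaction(self, statements):
        # Round-robin assignment (an assumption made for this sketch).
        session = self.sessions[self.next_index % len(self.sessions)]
        self.next_index += 1
        shown = []
        for statement in statements:
            if statement[0] == "set":
                _, key, value = statement
                session.settings[key] = value
            elif statement[0] == "show":
                shown.append(session.settings.get(statement[1], "default"))
        return session.name, shown
```

With two sessions in the pool, a client that runs `[("set", "search_path", "app")]` in one transaction and `[("show", "search_path")]` in the next gets `"default"` back, because the second transaction ran on a session that never saw the SET.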


Tostino 1156 days ago

I really wish there was a way to handle transaction pooling without losing session pooling features.

Essentially, a "please serialize everything (temp tables, SET GUC values, etc) from this session to disk and load it back when necessary".


dalyons 1161 days ago

Yeah, good callout. I'd be totally fine with a server mode, or connection option, that opts you into pooling but then disables the incompatible features. You choose pooled mode if you need it, and the tradeoffs are documented.


dalyons 1161 days ago

I believe there has been talk and perhaps progress towards building it into PG, but I can't find the dev thread right now, and I don't know what the status of that is.

Interestingly enough, this is what Oracle does AFAIK. They are also process-per-conn and have an optional sidecar proxy thingy that you can run on your Oracle host to do the pooling. I would rather it be built more tightly into the RDBMS, but that's not a terrible solution.


danmcs 1161 days ago

There were changes made to idle sessions in postgres 14.0 that were supposed to reduce the resource usage of open but idle connections.

Crunchydata mentioned it on their blog a while back (https://www.crunchydata.com/blog/five-tips-for-a-healthier-p...) and the pg 14 release notes mention a few changes to idle sessions (https://www.postgresql.org/docs/release/14.0/)

I don't know whether they were sufficient to make pgbouncer unnecessary; I haven't had a need to try it.


jake_morrison 1161 days ago

Postgres forks an OS process for each connection, which is relatively heavyweight. Oracle has a similar architecture to Postgres, and they solved it with a "multi-threaded listener". MySQL, in contrast, uses threads, which makes connections lighter weight.

It's fundamentally a question of how the connection listener communicates with the rest of the database, e.g., using shared memory or some other IPC mechanism, work queues, etc. Having too many connections results in problems with concurrent access and lock contention independent of how heavyweight the actual listening process is.
