Hacker News new | ask | show | jobs
by brazzy 1022 days ago
The point is that if you have 50 k8s pods that each have their individual connection pool, some of them will be holding idle connections while others are hitting their max connection limit. A single pool is much more flexible.

Additionally, the "transaction" mode of PgBouncer can increase the utilization of connections further by making them available to a different application when one application holds a connection while doing something different (e.g. waiting for a call to an external service).

4 comments

If one has CPU load for 50 pods, then it is not a usual project I guess? Average load should be thousands of requests per second I believe.

So the question is how many applications have CPU load for 50 pods but also cannot saturate a single database, so sharding is not considered yet? Gut feeling is that they are few, more like exceptions.

From my personal experience pgBouncer is used a lot when framework/library does not support application level pooling in the first place, and not so much when it does.

This. For the web service workloads most of us run, I’ve always run out of postgres connections before exceeding the vertical limit of a postgres server.
Is this because your default approach is to horizontally scale or because you have tried other options? Logically scaling vertically a single server with pooled connections should have a better effect than horizontally scaling to multiple servers that need to share connections through a proxy. The proxy doesn't magically increase the number of connections available to the database. It will add overhead to all your database interactions though.
Vertically scaling has a cost component. If you're able to use pgbouncer with transaction pooling to better utilize but not exceed your existing vertical postgres limit, then you're set. If you do, then you must scale up. Pgbouncer can help you lengthen your runway. Tuning your pgbouncer pool sizes can help minimize the overhead but as with anything we do, it's about weighing trade-offs for your situation.
Don't spin up 50 pods. You have outscaled your database. You can't make IO operations faster by throwing more workers at it and you can only have so many connections working at once. As a side note if your application is a typical IO bound web app its very unlikely you can process enough transactions to effectively use 50 workers in a single region.
We're not necessarily talking about 50 pods for the same application, could also be a zoo of smaller low traffic applications sharing a Postgres instance.
In those cases, does one keep the application-level pool at each pod, in addition to the "communal" PgBouncer pool, or does not offer any advantage?