Hacker News new | ask | show | jobs
by nkmnz 148 days ago
It took me years before I started tuning the memory-related configuration of postgres for workload, schema and data, in any way. It "just works" for the first ten thousand concurrent users.
4 comments

PostgreSQL has 2 memory-related parameters you need to set for larger instances - work_mem and shared_buffers, as these need to be set to a percentage of the VMs memory to utilize it well. However, pretty much every PostgreSQL setup guide names these two values, and on a managed PostgreSQL hosting I'd expect these to be set.

Outside of memory, log_duration, temp_file_limit, a good query plan visualizer and some backup and replication (e.g. PGBackrest and Patroni) are also generally recommended if self-hosting. Patroni doesn't even need an external config store anymore, which is great since you can just run it onto 3-4 nodes and get a high quality HA, easy to manage PostgreSQL cluster.

But those two parameters are pretty much all to have a PostgreSQL process thousands of transactions per second without further tuning. Even our larger DBs hosting simple REST-applications (opposed to ETL/Data warehousing) had to grow quite a lot until further configuration was necessary, if at all.

Checkpointing probably becomes the next issue then, but modern PostgreSQL actually has great logging there -- "Checkpoints occur too frequently, consider these 5 config parameters to look at". And don't touch VACUUM jobs, as a consultant once joked, he sometimes earns thousands of dollars to say "You killed a VACUUM job? Don't do that".

So yeah, actually running PostgreSQL takes a few considerations, but compared to 10 - 15 years ago, you can get a lot with little effort.

Exactly! Thank you for sharing your expertise!
I agree, but that doesn’t constitute „carefully tuning“ a config, at least not in my book. OP certainly didn’t mean to imply that it’s enough to follow along a 5min tutorial.
Well, most people working on a car don’t have a car lift: it only makes sense when you need to safely work on a large volume of cars. If you only work on one or two, a jack and a pile of wood works just fine.
Please don't move the goal post. Writing `no database ”just works” without (...)` is gatekeeping behavior, creating an image of complexity that for most use cases - especially for those starting out - just doesn't exist.
In fairness, it doesn’t exist for Elasticsearch either.
I have no clue about Elasticsearch, so you might be right – but on the other hand, you just contradicted your own statement about how difficult databases are, so I have no idea which of your statements I should trust.
Horses are great and run on grass. Formula cars are difficult to maintain. Cruise liners are even worse!
Then maybe you shouldn’t make a single statement about all means of transportation, claiming that you needed to be horse whisperer to ride any kind of bike, car, train, ship or plane.
Modern JVMs are pretty effective in most scenarios right out of the box.