Hacker News new | ask | show | jobs
by jacobkg 3032 days ago
This is potentially BIG news. At my last company Postgres bloat was a gigantic problem. We did a lot of data updates and sometimes customer databases (we sharded by schema) would be bloated by as much as 300%. This really hurts when some customers are 50+ GB. Not only does bloat take up more space but it slows down queries.

We would occasionally resort to taking the customer offline for scheduled maintenance to do a pg_dump/pg_restore to get their DB back in shape.

I'm very curious to hear when this might become available for production use cases.

EDIT: Article states this has been proposed for Postgres 12

4 comments

This can indeed be an issue, with both indexes and tables. For tables, I've had success with tuning fillfactor to take into account update patterns. If there's a deterministic subset of rows that are updated more frequently, splitting those into a smaller table/partition, perhaps with a different fillfactor can help as well. This also makes vacuum faster for this subset. Even if there isn't a deterministic subset, partitioning for smaller table size can speed up vacuum regardless.

For indexes, using concurrent reindex to create new indexes and dropping the old can make a difference.

Yeah, it would be nice if more of this were automated and it's good that there's work being done. There are some options available in existing that can ameliorate bloat.

Solid advice! I remember reading about fillfactor but didn't get a chance to try it out on our workload.
Were the tables under your control? Could you set a low fill factor [1] for frequently updated tables (eg 30 or lower)? That would help avoid creating too many db pages.

[1] https://www.postgresql.org/docs/10/static/sql-createtable.ht...

Are you aware of pg_repack [1]? I haven't had a chance to use it myself yet, but it's designed for reclaiming bloated space without downtime.

[1] https://github.com/reorg/pg_repack/

We tested pg_repack early on. Our experience (at least 5 years ago) was it didn't reliably clean up after itself in all cases so we were left with random triggers in the database. With thousands of customer databases this was a huge potential hassle.

I have heard plenty of people use pg_repack so maybe it got better or we didn't test it enough.

We use pg_repack and hit that exact problem (left over triggers etc). In the end we resorted to creating and dropping the pg_repack extension after each run.

(edit: Missed an "and")

Do you have any idea why their docs are not mentioning this, or even why their software aren't doing that itself?
I haven't used pg_repack in production, but on my development box with PgSQL 10.x, if you end the process early, pg_repack will complain on the next run and ask you to clean up by dropping and re-adding the extension. Maybe this occurs in the wild if the cleanup code is not reached, for whatever reason.
We use pg_repack with automated schedules and it's really helpful.
Was it an old version of Postgres? I'm just curious to know why VACUUM couldn't help. Did you try VACUUM FULL?
Vacuum helps with deletes but updates create new rows and vacuum never cleans up the old ones.

Edit: My statement about updates is not correct, see comments below

Vacuum Full works but it requires a full table lock so we could only use it on select tables for smaller customers. For large tables Vacuum Full can take many minutes and once you need to take downtime pg_dump/pg_restore becomes an option.

> Vacuum helps with deletes but updates create new rows and vacuum never cleans up the old ones.

Only if there are long running transactions older than the updates themselves. When no old transactions are left, vacuum reclaims old tuples.

If you're talking about vacuum being unable to reclaim pages that still have some live tuples, that problem affects deletes and updates equally.

Thank you for correcting my misunderstanding!
> Vacuum helps with deletes but updates create new rows and vacuum never cleans up the old ones.

Where did you read this?..