Hacker News new | ask | show | jobs
by hyuen 3511 days ago
The main problem here is that autovacuum threshold is something like c + m * nrows, and in large configurations you could have all sorts of table sizes.

How much change is a lot? 1% of the table + 50 rows (for small tables)? I would argue that sometimes is better to use a fixed threshold, e.g. c = 1000, m=0

All these approaches are hit or miss and are different per configuration. What I found useful is to choose the best parameters you can think of without forcing autovacuum to run everytime, and have an external job run vacuum manually to cleanup whatever got missed... eventually you can figure out the right configuration.

2 comments

I think the idea for this is either tune it for your problem cases, or if it's an issue at your scale, divide and conqueror your use case (small tables in one physical DB, large tables in another).
I don't understand what problem you have with the combination of threshold + scaling factor (and what would be a better solution). There are no perfect default values for those parameters, as it really depends on workload patterns.

The truth that on most systems you have four types of tables - (large,small) x (frequently modified, static). And usually there are only a few large, frequently modified tables, while the rest is either small or static (or both). At least that's what I see on the multi-TB databases we manage.

There are two approaches:

1) tune the defaults to be aggressive enough even for the large+updated tables (which works because the small small / infrequently modified don't really require that much maintenance, compared to the first category)

2) keep defaults that are fine for majority of tables, and then use ALTER TABLE to override the parameters for the few tables that require more aggressive maintenance

Which is exactly what the blog post is about.

If you have better idea, I'd like to hear it (and I mean that seriously).