by andrewstuart 1859 days ago

It would be nice to hear how much of a problem XID wraparound is in Postgres 14. Do the fixes below address it entirely, or just make it less of a problem?

I see no mention of addressing transaction id wraparound, but these are in the release notes:

Cause vacuum operations to be aggressive if the table is near xid or multixact wraparound (Masahiko Sawada, Peter Geoghegan)

This is controlled by vacuum_failsafe_age and vacuum_multixact_failsafe_age.

Increase warning time and hard limit before transaction id and multi-transaction wraparound (Noah Misch)

This should reduce the possibility of failures that occur without having issued warnings about wraparound.

https://www.postgresql.org/docs/14/release-14.html
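For readers unfamiliar with how "near wraparound" is measured, here is a minimal sketch of the idea behind the failsafe threshold. It is a simplified model, not the server's actual code: the function names are hypothetical, the modulo arithmetic ignores the handful of reserved XID values, and the real check involves other settings as well. The 1.6 billion figure is the documented default for vacuum_failsafe_age in PostgreSQL 14.

```python
# Simplified model of XID age and the failsafe threshold.
# Assumption: function names are illustrative, not PostgreSQL internals.

XID_SPACE = 2**32  # transaction IDs are 32-bit counters that wrap around
VACUUM_FAILSAFE_AGE_DEFAULT = 1_600_000_000  # documented default in PostgreSQL 14


def xid_age(next_xid: int, relfrozenxid: int) -> int:
    """Distance from relfrozenxid to next_xid, modulo 2^32.

    This ignores the reserved special XIDs, so it is only approximate.
    """
    return (next_xid - relfrozenxid) % XID_SPACE


def failsafe_should_trigger(
    next_xid: int,
    relfrozenxid: int,
    failsafe_age: int = VACUUM_FAILSAFE_AGE_DEFAULT,
) -> bool:
    """True when the table's relfrozenxid is older than the failsafe age."""
    return xid_age(next_xid, relfrozenxid) > failsafe_age


if __name__ == "__main__":
    # A table whose relfrozenxid lags the next XID by 1.65 billion.
    print(failsafe_should_trigger(1_700_000_000, 50_000_000))
    # The same table with a much smaller lag.
    print(failsafe_should_trigger(500_000_000, 50_000_000))
```

The modulo in xid_age is what lets the comparison keep working after the counter itself has wrapped past zero.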

petergeoghegan 1859 days ago

Co-author of that feature here.

Clearly it doesn't eliminate the possibility of wraparound failure entirely. Say for example you had a leaked replication slot that blocks cleanup by VACUUM for days or months. It'll also block freezing completely, and so a wraparound failure (where the system won't accept writes) becomes almost inevitable. This is a scenario where the failsafe mechanism won't make any difference at all, since the failure is just as inevitable with or without it (in the absence of DBA intervention).

A more interesting question is how much of a reduction in risk there is if you make certain modest assumptions about the running system, such as assuming that VACUUM can freeze the tuples that need to be frozen to avert wraparound. Then it becomes a question of VACUUM keeping up with the ongoing consumption of XIDs by the system -- the ability of VACUUM to freeze tuples and advance the relfrozenxid for the "oldest" table before XID consumption makes the relfrozenxid dangerously far in the past. It's very hard to model that and make any generalizations, but I believe in practice that the failsafe makes a huge difference, because it stops VACUUM from performing further index vacuuming.

In cases at real risk of wraparound failure, the risk tends to come from the variability in how long index vacuuming takes -- index vacuuming has a pretty non-linear cost, whereas all the other overheads are much more linear and therefore much more predictable. Having the ability to just drop those steps if and only if the situation visibly starts to get out of hand is therefore something I expect to be very useful in practice. Though it's hard to prove it.
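The "keeping up" argument in the last two paragraphs can be sketched as a back-of-the-envelope calculation. This is a toy model under stated assumptions, not a description of how VACUUM schedules work: the function names, the constant XID consumption rate, and the split of VACUUM time into a heap part and an index part are all hypothetical simplifications. It only shows why dropping the least predictable step widens the margin.

```python
# Toy model: does one VACUUM pass finish before the table's XID age
# reaches a chosen limit? All names and numbers here are illustrative.


def seconds_until_limit(current_age: int, limit_age: int, xids_per_second: float) -> float:
    """Time left before the table's XID age reaches limit_age,
    assuming XIDs are consumed at a constant rate."""
    if xids_per_second <= 0:
        raise ValueError("xids_per_second must be positive")
    return max(limit_age - current_age, 0) / xids_per_second


def vacuum_finishes_in_time(
    current_age: int,
    limit_age: int,
    xids_per_second: float,
    heap_seconds: float,
    index_seconds: float,
    skip_index_vacuuming: bool = False,
) -> bool:
    """Compare the estimated VACUUM duration against the remaining time budget.

    skip_index_vacuuming models the failsafe dropping index vacuuming.
    """
    work = heap_seconds if skip_index_vacuuming else heap_seconds + index_seconds
    return work < seconds_until_limit(current_age, limit_age, xids_per_second)


if __name__ == "__main__":
    # Hypothetical workload: age 1.6 billion, limit 2.0 billion, 10,000 XIDs/s.
    args = dict(current_age=1_600_000_000, limit_age=2_000_000_000,
                xids_per_second=10_000, heap_seconds=20_000, index_seconds=30_000)
    print(vacuum_finishes_in_time(**args))
    print(vacuum_finishes_in_time(**args, skip_index_vacuuming=True))
```

In this made-up example the time budget is 40,000 seconds, so the full pass (50,000 seconds) misses it while the heap-only pass (20,000 seconds) does not. Real workloads are far less regular, which is the point made above about index vacuuming being hard to predict.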

Long term, the way to fix this is to come up with a design that doesn't need to freeze at all. But that's much harder.

andrewstuart 1859 days ago

Very interesting thanks for the update - how great is the Internet to hear directly from the developer!

It's a pity this wasn't listed in the announcement, as I think a lot of people are interested in this issue.

>> Long term, the way to fix this is to come up with a design that doesn't need to freeze at all.

Do you know if anyone is turning their attention to this or is it not currently being tackled by anyone?

petergeoghegan 1859 days ago

> Very interesting thanks for the update - how great is the Internet to hear directly from the developer!

I see the names of a few people that also work on Postgres on this thread. We're not all that hard to get a hold of if you're a user that has some kind of feedback or question, for what it's worth. The culture is very open in that sense.

> Do you know if anyone is turning their attention to this or is it not currently being tackled by anyone?

This is one of the goals of the zheap project. I myself have some very tentative ideas for tackling it within the standard table access method, heapam. I have not specifically committed to working on it on any timeframe. I haven't completely convinced myself that the approach I'm thinking of is truly robust and practicable. It's pretty complicated, especially because I cannot really know what will break and need to be fixed until I spend significant effort on the implementation.
