Hacker News new | ask | show | jobs
by pwmtr 700 days ago
We want to ensure that the users are able to restore to any minute in the past.

Then, as @pintxo, @bjornsing, and @Doxin asked below; Why not let users pick a time and then replay the closest backup? Why create empty backups? This is a valid question.

The answer is that without an empty backup, it's not possible to confirm there was no write activity. Maybe some write activity occurred but you fail to archive the WAL file. You need at least one WAL file (empty or not) with a timestamp equal to or more recent than the target restore time to ensure there were no write activity.

I didn't explain this in detail in the blog post for the sake of brevity, but you don't need to create an empty WAL file for each minute with no write activity. For instance, if there was no write activity for 10 minutes and then a WAL file is generated, PostgreSQL can deduce there was no write activity during those 10 minutes and is able restore to any minute within that period. However, if the user wants to restore to 5 minutes ago and there has been no write activity for the past 10 minutes, PostgreSQL doesn't have a WAL file to confirm lack of write activity. In this case, you need to call pg_current_xact_id() to generate a WAL file. So the trick is, when you get a restore request, call pg_current_xact_id() only once, just in case if there were no write activity.

This is a good example of difference between running PostgreSQL for yourself vs. others. When running PostgreSQL for yourself, such edge cases is not a concern. However, when managing it for others, an edge case like this could prevent a restore and create an operational burden. Automating this process is required especially if you are managing lots of PostgreSQL databases.

1 comments

Thank you, that makes a lot of sense. I should have intuited that, but I couldn’t quite make the connection. I really appreciate the write up, it gave me a lot to think about as someone who has only ever managed my own DB.