|
|
|
|
|
by rembicilious
701 days ago
|
|
From the article:
“If the user isn’t actively working on the database, PostgreSQL won’t generate a WAL file each minute. WAL files are by default 16 MB and PostgreSQL waits for the 16 MB to fill up. So, your restore granularity could be much longer, and you may not be able to restore to a particular minute in time. You can overcome this problem by setting the archive_timeout and forcing Postgres to generate a new WAL file every minute. With this configuration, Postgres would create a new WAL file when it hits the 1-minute or 16 MB threshold, whichever comes first. The second issue with backup/restore is “no activity”. In this case, PostgreSQL wouldn’t create a new file even if the archive_timeout is set. As a solution, you can generate artificial write activity by calling pg_current_xact_id().” Can you explain why to create a WAL file even though there is no activity? |
|
Then, as @pintxo, @bjornsing, and @Doxin asked below; Why not let users pick a time and then replay the closest backup? Why create empty backups? This is a valid question.
The answer is that without an empty backup, it's not possible to confirm there was no write activity. Maybe some write activity occurred but you fail to archive the WAL file. You need at least one WAL file (empty or not) with a timestamp equal to or more recent than the target restore time to ensure there were no write activity.
I didn't explain this in detail in the blog post for the sake of brevity, but you don't need to create an empty WAL file for each minute with no write activity. For instance, if there was no write activity for 10 minutes and then a WAL file is generated, PostgreSQL can deduce there was no write activity during those 10 minutes and is able restore to any minute within that period. However, if the user wants to restore to 5 minutes ago and there has been no write activity for the past 10 minutes, PostgreSQL doesn't have a WAL file to confirm lack of write activity. In this case, you need to call pg_current_xact_id() to generate a WAL file. So the trick is, when you get a restore request, call pg_current_xact_id() only once, just in case if there were no write activity.
This is a good example of difference between running PostgreSQL for yourself vs. others. When running PostgreSQL for yourself, such edge cases is not a concern. However, when managing it for others, an edge case like this could prevent a restore and create an operational burden. Automating this process is required especially if you are managing lots of PostgreSQL databases.