| Excellent question. I test my backups and restores on a fairly constant basis. Each environment within my infrastructure takes a slightly different approach.

Application

This is by far the easiest for me to test. We have a CI/CD job which literally builds a new environment from scratch and deploys our application to it in a production configuration. It runs a test suite which exercises functionality across the application. Finally, it destroys the environment. It reports on each portion of the process. In this way we know exactly how long it would take to redeploy the entire application from scratch on new infrastructure and get it up and running. This morning it took about 6 minutes total before tests ran.

Database

We are running an RDBMS. We use a combination of daily full backups, incremental transaction-log-style backups, and point-in-time backups.

Again, this runs in our CI/CD. When a full backup is taken, it is pulled, loaded, and a test routine is run against it to check integrity. At that point, the restore from the day before is destroyed.

When a transaction log backup is made, CI/CD picks up the change, applies it to the restored full backup, and runs a set of integrity tests. This leaves us with a warm standby ready to be switched over to if the main database server goes down. We have never had to use the warm standby in an emergency, but we have a test to make sure we can cut over to it as well.

Point-in-time backup testing goes back to our application test above. The application test spins up with a point-in-time recovery of the database backup. It tests the integrity of that recovery and then tests the application against it. Finally, it swaps from the point-in-time recovered database to the warm standby and runs the test suite against that for integrity as well.
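To make the full-backup check above concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: SQLite stands in for the RDBMS (the actual engine isn't named here), and the `orders` table, the file names, and the function names are all hypothetical. In a real pipeline the CI/CD job would pull the backup from wherever it is stored rather than create it locally.

```python
import sqlite3
import tempfile
from pathlib import Path


def take_full_backup(source_path: Path, backup_path: Path) -> None:
    """Copy the live database to a backup file via SQLite's online backup API."""
    src = sqlite3.connect(source_path)
    dst = sqlite3.connect(backup_path)
    try:
        src.backup(dst)
    finally:
        dst.close()
        src.close()


def verify_restore(backup_path: Path, expected_rows: int) -> bool:
    """Open the backup like a fresh restore and run two integrity checks."""
    conn = sqlite3.connect(backup_path)
    try:
        # Structural check: SQLite returns the single row 'ok' when healthy.
        status = conn.execute("PRAGMA integrity_check").fetchone()[0]
        # Content check: hypothetical table, compared against a known count.
        rows = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    finally:
        conn.close()
    return status == "ok" and rows == expected_rows


def run_backup_test() -> bool:
    """Create a throwaway database, back it up, verify it, then clean up."""
    with tempfile.TemporaryDirectory() as workdir:
        live = Path(workdir) / "live.db"
        backup = Path(workdir) / "backup.db"

        conn = sqlite3.connect(live)
        conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
        conn.executemany(
            "INSERT INTO orders (total) VALUES (?)",
            [(9.5,), (20.0,), (3.25,)],
        )
        conn.commit()
        conn.close()

        take_full_backup(live, backup)
        return verify_restore(backup, expected_rows=3)


if __name__ == "__main__":
    # A CI job would fail the pipeline when this returns False.
    raise SystemExit(0 if run_backup_test() else 1)
```

The temporary directory plays the role of the environment that gets destroyed once the check has run. The same shape (restore, check, tear down) should carry over to other engines, with the integrity query swapped for whatever the engine provides.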
File Store

People often forget this, but those buckets that hold all of your file storage in the cloud can be destroyed so easily (sad, sad experience taught me this). We test those as well. I am sure you can guess at this point how we do that: CI/CD. It's a rather simple process with a ton of gain.

A few notes

People always ask me this, so I will answer it first. Yes, this costs money. It's not as bad as running a second production environment, but it will cost you a bit. My follow-up question is: how much does downtime cost you?

My CI/CD is always GitLab CI at this point. I've used Jenkins. I've used Travis. I like GitLab CI. You can do all of this with any of those.

We script literally everything. Computers are so good at repetitive tasks. Why would you EVER do anything manually? Really. If it has to do with your infrastructure, script it.

If anyone has any questions about these ideas, feel free to reach out. |