by penagwin
2321 days ago
Question: Can this clone a database, but also apply certain operations to it? I work at a small company, and currently we clone our production database to our dev machines for testing. However, certain information in the database is sensitive and we don't want it on our dev machines. (This specific sensitive data is also stored in an encrypted format and the key is not included, but we'd still prefer to leave it out.) Basically, I'd like to be able to clone the database and run some SQL to replace the data in some tables with mock data. But I can't think of an easy way to do this without cloning the database, starting a temporary instance, running the SQL, cloning it again, and distributing that final build to the devs.
This is requested quite often. For example, when the database is copied from production, all personal data sometimes has to be removed so as not to break regulations.
It is possible in Database Lab, but it's not a very user-friendly feature yet. Briefly, the process is as follows.
The "sync" Postgres instance is configured as a production replica (preferably using WAL shipping from the WAL archive). Then, periodically, a new snapshot is created. Currently this is done with this Bash script: https://gitlab.com/postgres-ai/database-lab/-/blob/master/sc.... (We are going to make it part of the database-lab server in upcoming releases.)
You can place any data transformations here: https://gitlab.com/postgres-ai/database-lab/-/blob/master/sc... so that the final snapshot used for thin cloning contains the adjusted data set. For example, all personal data can be removed or obfuscated.
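To make the idea of a data transformation concrete, here is a minimal sketch of an obfuscation step. It is not taken from the Database Lab scripts. The `users` table and its columns are hypothetical, and it runs against an in-memory SQLite database as a stand-in so it can be tried anywhere. Against Postgres, a similar UPDATE would be executed on the instance before the snapshot is taken.

```python
import sqlite3

# Hypothetical schema: a `users` table with personal columns.
# Adapt the table and column names to your own database.
ANONYMIZE_SQL = """
UPDATE users
SET email = 'user' || id || '@example.com',
    full_name = 'User ' || id,
    ssn_encrypted = NULL
"""


def anonymize(conn):
    """Overwrite personal columns in place with deterministic mock values."""
    with conn:  # commits on success, rolls back on error
        conn.execute(ANONYMIZE_SQL)


def demo():
    """Build a tiny stand-in database, anonymize it, and return the rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users ("
        "id INTEGER PRIMARY KEY, email TEXT, full_name TEXT, ssn_encrypted BLOB)"
    )
    conn.executemany(
        "INSERT INTO users (email, full_name, ssn_encrypted) VALUES (?, ?, ?)",
        [
            ("alice@corp.test", "Alice A", b"\x01\x02"),
            ("bob@corp.test", "Bob B", b"\x03\x04"),
        ],
    )
    anonymize(conn)
    rows = conn.execute(
        "SELECT id, email, full_name, ssn_encrypted FROM users ORDER BY id"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in demo():
        print(row)
```

Deriving the mock values from the primary key keeps them unique and reproducible, which helps when a column has a unique constraint.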
Of course, if you do this, keep in mind that you'll physically have a different database. That may affect some kinds of testing (for example, troubleshooting bloat issues or some cases of index performance degradation). There are various choices to be made here. If you're interested, we'll be happy to help. Please join our community Slack, which is mentioned in the docs and README.