| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by pawelduda 1204 days ago
	I see this doesn't yet support something I've wanted for some time: What's the simplest way to snapshot a database, perform an operation in app, snapshot it again and get an overview of how many rows were affected in each table? Like an overview of diff. Optionally list changed pkey ids. I tried googling but couldn't find anything like this, so I was thinking of making a DYI solution but wasn't that desperate yet.

5 comments

PaulMest 1204 days ago

I'd guess that https://neon.tech/ could help with this. Neon is branded as "Serverless Postgres". They have APIs to create branches of your database.

So you could effectively:

1) Create a snapshot of your production DB -> DB_2A

2) Then create a snapshot of that snapshot -> DB_2B

3) Now you have two copies of the exact same database. Run your query/workload/migration on DB_2B.

4) Run some metadata queries against DB_2A and DB_2B and compare the results.

5) If your metadata queries are inline with expectations, delete the snapshots. If not, leave them around for a bit for manual inspection.

link

pawelduda 1204 days ago

Interesting. Will check it out in detail. From a quick glance tho I'm afraid that vendor-specific cloud is a no-go because I would rather not upload any client's data in there, even if anonymized. Besides, it's an overkill to convince anyone to switch to a different vendor for sake of diffing DB alone :)

link

ciminelli 1204 days ago

We were looking at tooling for snapshotting production data for testing purposes, that use case is interesting to have the diff view based on changes happening from application actions. Would you use it for testing changes or more for debugging production issues?

link

pawelduda 1204 days ago

Well, my main use case would be to speed up project onboarding where I can play around with the app from user perspective and check how my actions impact database... Could help cutting through a lot of frontend/backend layers and just focus on raw data.

If you have multiple DB snapshots and WAL enabled, theoretically you could inspect the log to see what happened inbetween. There's pg_xlogdump for that but I think it will output very raw data...

link

ciminelli 1203 days ago

How would you want to actually look at the database diff? Just a summary view of rows that were changed given a certain time period?

link

pbronez 1203 days ago

Dolthub does that https://www.dolthub.com/

link

krembo 1204 days ago

Why not use aws rds snapshots to create a new replica in no-time?

link

klysm 1204 days ago

I wonder if you could utilize templates to snapshot the DB.