by hobs 605 days ago
It's a good question, because my answer for a system like this, which had very little schema change, was to just dump it into a database and add historical tracking per object that way: hash, compare, insert, and add a historical record.
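A rough sketch of that hash/compare/insert loop, assuming SQLite and a made-up `object_history` table (the table, column, and function names here are all hypothetical, not from any real system):

```python
import hashlib
import json
import sqlite3


def row_hash(attrs: dict) -> str:
    """Stable hash of an object's attributes (keys sorted so order doesn't matter)."""
    canonical = json.dumps(attrs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical history table: one row per observed version of an object.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS object_history (
            object_id   TEXT NOT NULL,
            recorded_at TEXT NOT NULL,
            row_hash    TEXT NOT NULL,
            payload     TEXT NOT NULL
        )
        """
    )


def record_if_changed(conn: sqlite3.Connection, object_id: str,
                      attrs: dict, recorded_at: str) -> bool:
    """Insert a history row only when the hash differs from the latest stored one."""
    new_hash = row_hash(attrs)
    latest = conn.execute(
        "SELECT row_hash FROM object_history WHERE object_id = ? "
        "ORDER BY recorded_at DESC, rowid DESC LIMIT 1",
        (object_id,),
    ).fetchone()
    if latest is not None and latest[0] == new_hash:
        return False  # unchanged since the last recorded version
    conn.execute(
        "INSERT INTO object_history (object_id, recorded_at, row_hash, payload) "
        "VALUES (?, ?, ?, ?)",
        (object_id, recorded_at, new_hash, json.dumps(attrs, sort_keys=True)),
    )
    return True
```

Calling `record_if_changed` once per object on every load leaves you with one row per distinct version, so an older state can be pulled back out by `recorded_at`. This assumes `recorded_at` is an ISO-8601 string so that text ordering matches time ordering.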

I do have the current state in the DB. However, I sometimes need to compare today's file with the one from 6 months ago.
So I assumed something like this: you have the same schema with the same tabular format, whether inside the database or the XML document, and the state changes are recorded in a way that lets you tell the timestamp. Then you can bring up both states at the same time and compare them attribute by attribute for discrepancies.

EXCEPT/INTERSECT make this easy for a bunch of columns (excluding the timestamps, of course; I usually hash the compared columns for performance reasons), but they won't tell you what the difference itself is. For that you have to do column-by-column comparisons, which is where I usually shell out to my language of choice, because SQL sucks at doing that.
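For illustration, a minimal sketch of that split: let SQL's EXCEPT find which rows changed, then do the column-by-column part in Python. It assumes SQLite and two hypothetical snapshot tables, `snapshot_old` and `snapshot_new`, with made-up columns `id`, `name`, `price` plus a load timestamp that is left out of the comparison.

```python
import sqlite3

COLUMNS = ("id", "name", "price")  # hypothetical; timestamp column deliberately excluded


def changed_ids(conn: sqlite3.Connection) -> list:
    """IDs of rows in snapshot_new that are new or differ from snapshot_old."""
    cols = ", ".join(COLUMNS)
    rows = conn.execute(
        f"SELECT {cols} FROM snapshot_new EXCEPT SELECT {cols} FROM snapshot_old"
    ).fetchall()
    return sorted(row[0] for row in rows)


def fetch_row(conn: sqlite3.Connection, table: str, object_id):
    """One row as a dict, or None if the id is absent from that snapshot."""
    cols = ", ".join(COLUMNS)
    row = conn.execute(
        f"SELECT {cols} FROM {table} WHERE id = ?", (object_id,)
    ).fetchone()
    return dict(zip(COLUMNS, row)) if row else None


def diff_columns(old, new) -> dict:
    """Map column -> (old_value, new_value) for each column that differs."""
    old = old or {}
    new = new or {}
    return {c: (old.get(c), new.get(c)) for c in COLUMNS if old.get(c) != new.get(c)}


def snapshot_diff(conn: sqlite3.Connection) -> dict:
    """Per-id column differences for every row EXCEPT flagged as changed."""
    return {
        i: diff_columns(fetch_row(conn, "snapshot_old", i),
                        fetch_row(conn, "snapshot_new", i))
        for i in changed_ids(conn)
    }
```

Note that this only catches rows that are new or modified in `snapshot_new`; rows deleted since the old snapshot would need the EXCEPT run in the opposite direction as well.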