
A long time ago, we were trying to compare a couple of tables with a few hundred million rows in each to see whether the differences (due to a new way of processing) were allowable. Our local Oracle Boy whipped up a query, set it running, and we all sat around for hours whilst it churned; the end result was that we could do one comparison a day. After a while, I experimented with dumping the tables as CSV, piping them through `sort`, and then using some Perl to compare each paired (or not!) line, with some heuristics for quick rejection. That all took about 1-2 hours, meaning we could get through three, maybe four, tests a day instead.
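
For anyone curious what that kind of comparison looks like, here is a rough sketch in Python rather than the original Perl (which I'm not reproducing here). It assumes both dumps are sorted on a unique key in the leading column(s), in the same order Python uses to compare strings (e.g. a byte-wise sort on the key field). The names `compare_sorted` and `allowable`, and the numeric-tolerance rule, are made up for illustration; the real heuristics were specific to our data.

```python
import csv


def allowable(l, r, tol=1e-6):
    # Hypothetical rule: numeric fields may differ by up to tol,
    # every other field must match exactly.
    if len(l) != len(r):
        return False
    for a, b in zip(l, r):
        if a == b:
            continue
        try:
            if abs(float(a) - float(b)) > tol:
                return False
        except ValueError:
            return False
    return True


def compare_sorted(left_lines, right_lines, key_cols=1):
    """Walk two key-sorted CSV streams in step, yielding only the differences.

    Yields (kind, left_row, right_row) where kind is one of
    "left_only", "right_only" or "changed".
    """
    left = csv.reader(left_lines)
    right = csv.reader(right_lines)
    l = next(left, None)
    r = next(right, None)
    while l is not None or r is not None:
        if r is None or (l is not None and l[:key_cols] < r[:key_cols]):
            # Key exists only on the left side.
            yield ("left_only", l, None)
            l = next(left, None)
        elif l is None or r[:key_cols] < l[:key_cols]:
            # Key exists only on the right side.
            yield ("right_only", None, r)
            r = next(right, None)
        else:
            # Same key: identical rows are rejected immediately,
            # the slower field-by-field check runs only when they differ.
            if l != r and not allowable(l, r):
                yield ("changed", l, r)
            l = next(left, None)
            r = next(right, None)
```

In practice you would pass in two file objects opened with `open(path, newline="")`. Since each file is read once, front to back, memory use stays flat regardless of table size; the sort is where the real time goes.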