by evanelias 2603 days ago
A totally valid point, but I'd argue those should be handled by a separate tool or process. Data migrations tend to be fully programmatic; tools and frameworks can help reduce the code required, but cannot handle every possible case. (Having performed numerous multi-billion-row data migrations, I learned this painfully first-hand.)

For simpler cases, where it may make sense to run a data migration immediately after a schema change, a good generic middle ground may be configurable hook scripts. A declarative schema management system can then pass relevant info to the hook (which tables were changed, for example), and the script can then run any arbitrary row-data diff/apply/migrate logic.
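
To make the hook idea concrete, here is a minimal sketch of how a schema tool might hand the list of changed tables to a user-configured script. Everything named here is an assumption for illustration: the `CHANGED_TABLES` environment variable and the functions `build_hook_env` and `run_post_schema_hook` are hypothetical and not taken from any real tool.

```python
import os
import subprocess
import sys


def build_hook_env(changed_tables, base_env=None):
    """Copy the environment and add the list of changed tables for the hook."""
    env = dict(os.environ if base_env is None else base_env)
    # CHANGED_TABLES is a hypothetical variable name, not a real tool's API.
    env["CHANGED_TABLES"] = ",".join(changed_tables)
    return env


def run_post_schema_hook(hook_command, changed_tables):
    """Run the hook only if the schema change touched at least one table."""
    if not changed_tables:
        return None
    result = subprocess.run(
        hook_command,
        env=build_hook_env(changed_tables),
        capture_output=True,
        text=True,
        check=True,  # a failing hook raises CalledProcessError
    )
    return result.stdout


if __name__ == "__main__":
    # Stand-in hook: a real one would run the row-data migration logic.
    hook = [sys.executable, "-c", "import os; print(os.environ['CHANGED_TABLES'])"]
    print(run_post_schema_hook(hook, ["users", "orders"]))
```

Passing the info through the environment keeps the hook language-agnostic: the script can be shell, Python, or anything else, and it decides for itself what to do with each table.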

I do understand your point though; for relatively straightforward data migrations, an imperative system can capture these much more cleanly by just coupling them with the corresponding schema migration code.

1 comment

I honestly like the way Rails does it: both capturing the imperative deltas and dumping the final schema, which gets checked in. I'm not a big fan of down migrations; they're usually a waste of time.

Otherwise I like Percona's OSC (online schema change) tool, particularly how it can throttle table rewrites when there's competing work or replication is lagging too much. We're just at the point where we need to automate the OSC tool rather than using it as a point solution for migrating our bigger tenants.
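
As a rough sketch of what automating it could look like, assuming "Percona's OSC" means `pt-online-schema-change` from Percona Toolkit and that each tenant lives in its own database (both are assumptions, not stated above): the helper below only builds the command lines, using the tool's `--max-lag` and `--max-load` throttling options. The function names and the tenant layout are hypothetical.

```python
def build_osc_command(database, table, alter,
                      max_lag_seconds=1, max_threads_running=25,
                      execute=False):
    """Build an argv list for pt-online-schema-change; nothing is executed."""
    return [
        "pt-online-schema-change",
        "--alter", alter,
        # Pause copying while replica lag exceeds this value.
        "--max-lag", f"{max_lag_seconds}s",
        # Pause copying while the server is busier than this threshold.
        "--max-load", f"Threads_running={max_threads_running}",
        # Default to a dry run so a plan can be reviewed first.
        "--execute" if execute else "--dry-run",
        f"D={database},t={table}",
    ]


def plan_tenant_migrations(tenant_databases, table, alter, **options):
    """One command per tenant database (hypothetical one-db-per-tenant layout)."""
    return [build_osc_command(db, table, alter, **options)
            for db in tenant_databases]


if __name__ == "__main__":
    for cmd in plan_tenant_migrations(["tenant_a", "tenant_b"], "orders",
                                      "ADD COLUMN note TEXT"):
        print(" ".join(cmd))
```

Keeping command construction separate from execution makes it easy to review or log the plan before handing each argv list to something like `subprocess.run`.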