by spoondan 4284 days ago
This is a really good write-up.

In consulting and mentoring on this topic, I've found that a lot of engineers push back on how "dirty" it feels to keep multiple copies of the data around in different formats. It feels wrong not to have a single, authoritative data format at any given instant. If the idea is to change the column type, why not just `ALTER TABLE ... ALTER COLUMN` instead of `ALTER TABLE ... ADD`?

But if you think about it, excepting trivial cases, once you're migrating data, there are parallel realities at least for the duration of the migration and deployment. It's not a question of whether versioning or staging your data (in some fashion) creates divergence. It's a question of whether you manage the divergence and convergence of the parallel realities that already exist as part of a migration. If you don't, you either incur downtime or risk data corruption.
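
To make the "manage the divergence" point concrete, here is a minimal sketch of the `ALTER TABLE ... ADD` route using Python's `sqlite3`. The `users` table, the `signup_date` and `signup_ts` columns, and the helper names are all made up for illustration; a real migration would spread these steps across separate deploys.

```python
import sqlite3

def expand(conn):
    # Step 1: add the new column next to the old one (ALTER TABLE ... ADD).
    # Old code keeps working because nothing it reads has changed.
    conn.execute("ALTER TABLE users ADD COLUMN signup_ts INTEGER")

def write_user(conn, name, signup_date):
    # Step 2: while both formats exist, every new write fills both columns.
    conn.execute(
        "INSERT INTO users (name, signup_date, signup_ts) "
        "VALUES (?, ?, CAST(strftime('%s', ?) AS INTEGER))",
        (name, signup_date, signup_date),
    )

def backfill(conn):
    # Step 3: converge rows written before the new column existed.
    conn.execute(
        "UPDATE users "
        "SET signup_ts = CAST(strftime('%s', signup_date) AS INTEGER) "
        "WHERE signup_ts IS NULL"
    )

# Hypothetical starting point: one row in the old format only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, signup_date TEXT)")
conn.execute("INSERT INTO users VALUES ('old', '2014-01-01')")

expand(conn)
write_user(conn, "new", "2014-01-02")
backfill(conn)

rows = conn.execute(
    "SELECT name, signup_ts FROM users ORDER BY name"
).fetchall()
# Step 4 (not run here): once no code reads signup_date, drop it.
```

After the backfill, every row has both representations, so readers can switch to `signup_ts` whenever they're deployed, and the old column only goes away once nothing depends on it.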

One big win here is that, by being disciplined about your code and data changes, you can cleanly separate deployment from release. You can deploy a feature but have it disabled or only enabled for a subset of users. Releasing a feature means enabling its feature flag, not orchestrating a set of migrations, replications, and deployments.
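
As a rough illustration of "deployed but not released", here is a small sketch of a flag check with a percentage rollout. The `FLAGS` dict, the `new_signup_ts` flag name, and `is_enabled` are hypothetical; in practice the flag state would live in a config store or a flag service rather than in code.

```python
import hashlib

# Hypothetical flag store: the code path is deployed, the flag decides
# who actually sees it.
FLAGS = {"new_signup_ts": {"enabled": True, "percent": 25}}

def is_enabled(flag, user_id, flags=FLAGS):
    cfg = flags.get(flag)
    if not cfg or not cfg["enabled"]:
        return False
    # Hash flag + user so each user lands in a stable bucket from 0 to 99.
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < cfg["percent"]

def signup_column(user_id):
    # Readers pick the new or old column based on the flag, not on a deploy.
    return "signup_ts" if is_enabled("new_signup_ts", user_id) else "signup_date"
```

Releasing to everyone is then just setting `percent` to 100, and rolling back is setting `enabled` to `False`, with no migration or redeploy involved.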