Hacker News new | ask | show | jobs
by quilombodigital 1649 days ago
Through the years I have many times worked on this problem. I´ve worked with rake tasks when using ruby, and many times in java, with flyway and my own system. Database Migration is not a complicated problem, you just commit to the repository a sequence of SQL/code scripts to execute and keep at the database the current step. The big misconception people have is that they can "rollback" when an issue arises. The true fact is that when there is an issue, it happened because something not planned happened, and this can be spurious data in production, production environment differences, script errors, or whatever, and trying to rollback automatically in an undetermined state is just crazy. That´s why migrations must be always FORWARD, no matter what. You try to mitigate all risks, the best you can, by running the migration in test databases (the perfect scenario is to perform a test in a production data clone, but this is not usually possible) and if anything wrong happens, deal with it, fix the database by hand, make code patches, and after it is solved, try to figure out why this slipped through your QA.
2 comments

The difference is largely semantic in the sense that you're still going "forwards" with a down migration, but the important bit is that by calling it a rollback you're accepting that the up migration won't be run again if you ever completely tear down the database and start again. Suggesting that you only ever go forwards implies that you'd run the incorrect migration and the fix for it again in the future,but you wouldn't. Calling it a rollback captures that information.
It's not a semantic difference. If you rollback, that's not in the history. That's fine if the rollback perfectly reversed your original migration. But if it doesn't (developers make mistakes), now your database has reached a state that can't be reproduced by running migrations, defeating the entire point.

> Suggesting that you only ever go forwards implies that you'd run the incorrect migration and the fix for it again in the future,but you wouldn't.

I wrote a tool that specifically enforces this for my company, and it's worked well for the past four years in prod, staging and on a dozen dev environments, so I don't think your assertion is true.

> The big misconception people have is that they can "rollback" when an issue arises

I have reasonably working rollbacks, if things go sideways it takes only a few seconds to go undo the latest changes.

I guess this is exactly this misconception. The problem happens when you encounter a scenario where your rollback doesn't work. I.e.:

> when there is an issue, it happened because something not planned happened

In 12+ years it worked every time.

I understand it didn't work for you, but it does work for me.

I'm curious how you implemented this system.
In my SQL Server database I have a VERSION table with one row and one column that contains the current "version" of the database, and a DowngradeSteps table that keeps track of what needs to be done to undo the changes.

In my project source control I have a SQL script SyncSchema.sql. It contains a series of step-pairs (roll-forward + roll-back):

  if (select V from VERSION)<XXX
  begin
      begin tran 
          update V=V+1 from VERSION
          ***Roll-forward: ALTER TABLE FOO ADD BAR INT***
          insert DowngradeSteps values (
              (select V from VERSION), 
              '***Roll-back: ALTER TABLE FOO DROP COLUMN BAR***'
          )
      commit tran 
   end
where XXX started from zero and is going up every time I need to add a change. The last batch in the file is the highest XXX it knows about, and it is therefore the TARGET db version. This script run upon each deployment, and if you ignore the DowngradeSteps/roll-back part is the usual roll-forward scheme.

Now the roll-back part: once SyncSchema.sql runs through all steps up to the TARGET version, it opens up the DowngradeSteps table and executes all roll-back steps contained therein, that exceed the TARGET. This way I can sync my source control to any point in the past and immediately deploy - the C# code is updated to the desired version and the SQL database is updated to the TARGET version, so I enjoy a consistent picture. It takes literally seconds to do.

The roll-forward and roll-back changes are always coded in pairs and are tested and code-reviewed before before being committed.

I have put this scheme in place 12 years ago and it's been serving me well *

* One caveat is that SQL Server sometimes throws a fit due to lame name binding rules (it binds names inside the IF block even when condition is false), so you have to enclose some of the steps into exec(), and you have to do this proactively(!). This sucks, but maybe your database doesn't do that.

Schema(or data in a database) in your release N should be compatible with release N-1, so you can revert your code to previous working version with new schema. That's the only way to ship new releases without stopping service for maintenance.