by heavenlyhash 3585 days ago
The biggest issue I have with this model -- and git-flow as an opinion about how deploy works in general -- is that it doesn't take any account of rollbacks, and it doesn't scale well to arbitrary numbers of deployments.

The history of my production deploys is not monotonically forward: if something breaks, it rolls back. Nor does production roll forward as one piece: different components of the stack roll in separate motions (the db schema vs the frontend servers for example).

Git, tied to the project dev history, does not represent these things well. Reverts in the deploy branch are not semantically identical to rollbacks in production, and it's not necessarily safe or wise to merge them back into dev history.

A separate git repo, referencing release numbers, or dev repo commit hashes, would work pretty well on the other hand...
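To make the "separate record" idea concrete, here is a minimal sketch of a deploy ledger, assuming an append-only log of per-component events that point at release numbers or dev-repo commit hashes. The names `DeployEvent`, `DeployLedger`, and the component labels are all hypothetical, and a real version would presumably persist the log (for example as commits in that separate repo) rather than keep it in memory.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DeployEvent:
    component: str  # e.g. "db-schema" or "frontend" (hypothetical names)
    ref: str        # release number or dev-repo commit hash
    action: str     # "deploy" or "rollback"


class DeployLedger:
    """Append-only record of production changes, kept apart from dev history."""

    def __init__(self) -> None:
        self.events: List[DeployEvent] = []

    def deploy(self, component: str, ref: str) -> None:
        self.events.append(DeployEvent(component, ref, "deploy"))

    def rollback(self, component: str) -> Optional[str]:
        """Record a rollback to the ref that was live before the current one."""
        live = self._live_refs(component)
        if len(live) < 2:
            return None  # nothing earlier to go back to
        target = live[-2]
        self.events.append(DeployEvent(component, target, "rollback"))
        return target

    def _live_refs(self, component: str) -> List[str]:
        # Replay the log: a deploy pushes a ref, a rollback pops the newest one.
        stack: List[str] = []
        for ev in self.events:
            if ev.component != component:
                continue
            if ev.action == "deploy":
                stack.append(ev.ref)
            elif ev.action == "rollback" and len(stack) > 1:
                stack.pop()
        return stack

    def current(self) -> Dict[str, str]:
        """Ref currently live for each component."""
        components = {ev.component for ev in self.events}
        return {c: self._live_refs(c)[-1] for c in components}
```

The point of the design is that a rollback is just another event in the ledger, per component, so the production history can move backwards without any revert commit ever touching the dev branch.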


I'd say rollback is possible under a 'git driven' workflow.

That said, sincerely I find rollback one of those inherently complex ideas:

- Rolling back assumes going back to the previous commit will fix everything, an unproven (unprovable?) hypothesis in the face of database migrations, job queues, etc.

- Making database migrations reversible can be nearly impossible (particularly at scale) without a significant engineering effort, and that effort goes into something that should absolutely never happen.

So I just don't contemplate the possibility of rolling back a deploy.

Instead I try things (particularly migrations) on staging rigorously:

- staging environments are always ephemeral - created from scratch for a given release

- always load fresh production DB into staging

- check that all my model objects are still `.valid?` (http://api.rubyonrails.org/classes/ActiveRecord/Validations....) after the migration

- leave staging running a few days.

- if you really can (not easy), forward production traffic to staging as well.
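The post-migration validity check in the list above can be sketched roughly like this. This is a language-neutral analogue in Python, not the Rails `.valid?` API: the `invalid_records` helper, the validator names, and the sample rows are all made up for illustration, and a real sweep would iterate over the migrated staging database instead of an in-memory list.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Record = Dict[str, object]
Validator = Callable[[Record], bool]


def invalid_records(
    records: Iterable[Record],
    validators: Dict[str, Validator],
) -> List[Tuple[object, List[str]]]:
    """Return (id, failed validator names) for every record that fails a check."""
    failures = []
    for rec in records:
        failed = [name for name, check in validators.items() if not check(rec)]
        if failed:
            failures.append((rec.get("id"), failed))
    return failures


# Hypothetical rules for a "users" table after a migration.
validators = {
    "email_present": lambda r: bool(r.get("email")),
    "age_non_negative": lambda r: isinstance(r.get("age"), int) and r["age"] >= 0,
}

# Stand-in for rows loaded from a fresh production copy.
rows = [
    {"id": 1, "email": "a@example.com", "age": 30},
    {"id": 2, "email": "", "age": 41},
    {"id": 3, "email": "c@example.com", "age": -1},
]

print(invalid_records(rows, validators))
# prints [(2, ['email_present']), (3, ['age_non_negative'])]
```

An empty result after the migration is the signal you want; anything else is a reason not to ship the release.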

If things go wrong (which under my proposed discipline would be a massive screw-up), then the fix would require analysis, a regular fix (no time travelling), and a regular deploy.

Reacting instantly (i.e. without analysis) is kind of delusional thinking. I'd rather stay broken a little longer for avoiding further complications!

I heartily agree with your sentiment "rollback one of those inherently complex ideas" :) It's true that sometimes it's not even well-defined, such as for database migrations.

Some of this also depends on context. If I'm shipping a single primary deployment of a massive fairly monolithic SaaS product, I can do this time-marches-on stuff. If I'm actually shipping shrinkwrapware -- and as a sibling comment says, doing rolling blue-green deploys also looks like this, if briefly -- switching something back to a previous code version is very worth minding.

Code rollbacks are about immediate mitigation, not about pie in the sky snapshot rollback. If you are sane about deployment and don't go to 100% of traffic instantly, then halting a broken deploy and rolling back is certainly better than shifting into analysis mode.
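A rough sketch of the "don't go to 100% instantly" idea, assuming you can read an error rate at each traffic percentage. The function name `staged_rollout`, the `error_rate_at` callback, and the threshold are hypothetical; a real system would shift traffic, wait, and query metrics where this sketch just calls the callback.

```python
from typing import Callable, Sequence, Tuple


def staged_rollout(
    stages: Sequence[int],
    error_rate_at: Callable[[int], float],
    max_error_rate: float,
) -> Tuple[str, int]:
    """Walk through traffic percentages; halt at the first unhealthy stage.

    Returns ("completed", last_pct) or ("rolled_back", pct_where_it_failed).
    """
    if not stages:
        raise ValueError("at least one rollout stage is required")
    for pct in stages:
        # Placeholder for: shift pct% of traffic, wait, then read metrics.
        if error_rate_at(pct) > max_error_rate:
            # Immediate mitigation: stop here and send traffic back to the old version.
            return ("rolled_back", pct)
    return ("completed", stages[-1])


# A deploy that only starts failing once it sees 10% of traffic.
print(staged_rollout([1, 10, 50, 100], lambda pct: 0.2 if pct >= 10 else 0.0, 0.01))
# prints ('rolled_back', 10)
```

The analysis still has to happen, but it happens after the broken version has stopped taking traffic.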

> I'd rather stay broken a little longer for avoiding further complications!

Unfortunately, analysis is slow and unbounded, and high-leverage businesses where every second of downtime is a measurable loss simply cannot accept this trade-off.

Good point. Few solutions are apt for every scale and every business!

OTOH, if you really rehearse a given deployment again and again, you can become really confident that the operation will succeed in production.

Real example: the most important feature I've developed this year has gone through staging 5+ times over a couple of months. Every time I've asserted all kinds of things, gathered feedback from the business owner, etc.

The deployment going bad in production is just not a possibility.

At a larger scale than mine, I would probably introduce 'dark launching' as well. That would further reduce the possibility of needing rollback.

Could that not just be shuffled into the CI side of things?

issues occur -> redeploy last-1 deployment
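As a sketch of what that CI hook might look like, assuming the pipeline keeps an ordered list of deployed releases and has some `redeploy` step it can call (both hypothetical here, as are the function names):

```python
from typing import Callable, List, Optional


def rollback_target(deploy_history: List[str]) -> Optional[str]:
    """Return the release deployed just before the current one, if any.

    deploy_history is ordered oldest to newest; the last entry is live.
    """
    if len(deploy_history) < 2:
        return None
    return deploy_history[-2]


def handle_health_check(
    healthy: bool,
    deploy_history: List[str],
    redeploy: Callable[[str], None],
) -> Optional[str]:
    """On a failed health check, redeploy the last-1 release and return its ref."""
    if healthy:
        return None
    target = rollback_target(deploy_history)
    if target is not None:
        redeploy(target)  # hypothetical CI step that ships the given release
    return target
```

Note this only switches the code version. As pointed out upthread, it does nothing for database migrations or queued jobs that the newer release already touched.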