|
|
|
|
|
by SayWhatIMean
3983 days ago
|
|
I think diff is OK as a verification check after a deployment. But diff is not OK to generate the change script itself. It's not about abusing the diff. It's about diff being fundamentally the wrong technique to transition from state (C)urrent -> (T)arget. A diff cannot always bring you to the desired T. The diff only factors in 2 states (C and T) and not X,Y,Z which you may have an import from and which may need to occur during a transitional state before T is reached. It doesn't factor the desired state T may not even exist to diff against if developers move forward adding objects not for this release. It makes things easier on web developers at the expense of deployments. Which "scales" better? Migration tools scale better at letting web-devs free wheel in the DB. For bullet proof deployments migration tools don't scale. They excel at the simple cases but hit a hard limit. BTW any details on what kind of data in production broke a deployment? Heads up to avoid it myself. |
|
As for some of the issues I have seen with teams moving fast or that are larger. In one case we had around 40 commonly used databases, hundreds of tables in most of them, millions of data rows, multiple servers etc. Not saying the DB architecture was ideal, but it was not uncommon for larger systems at mid+ sized Enterprises I have seen as a consultant. Many times it is when an organization is between rapid growth still and getting discipline that you see a lot more of these types of issues (larger teams, fast moving still with a lack of BTDT leadership). The issues we saw between dev, stage and production were 1. the datasets were too large to replicate from Production to each environment, 2. security concerns over too many people having access to some of the data, 3. multiple teams split functionally instead of per sprint.
So how did these cause issues.
1. We would not see all the data in each environment and while we would test across a wide swipe of data and also run queries trying to validate our alter and update scripts it wasn't uncommon every couple of deployments to have DB scripts fail to run because of some data we couldn't test against. The toolset helped us a little here, because the DBA's set it up to auto test changes against the production data before it would let the script run. And every time a script would fail they would add new test cases if it warranted it.
2. Since not everyone could have access to all the data they couldn't predict every outcome, see #1. So the toolset helped us automate what they couldn't see so we could prevent some issues. This is very common when dealing with financial or PII in health care settings. Also, this SHOULD and could have been mostly resolved by not splitting teams functionally #3.
3. Teams were split by their function instead of sprint. We had a data team, analytics team and development, each working mostly independent of each other. Meaning it wasn't rare for the data team to release a script to production in the morning which didn't get applied to dev/staging and then development comes along a few hours later tests and gets sign off on their changes in staging. Then development deploys to production and has scripts fail to run in production or causing some data havoc (or vice versa between the teams). Had the teams been integrated better this would have been less likely but still not impossible. And I hear all the purist saying well the problem was someone skipped stage etc. Yep, it sure is a core problem, but frankly in every environment I have worked I have seen it happen and so to dismiss it as it shouldn't happen is naive. Like saying well I am not testing for any errors in my code because there shouldn't be any, uh huh. So when we used the toolset it would tell us hey, something has changed here, and the DDL looks different than what you said it should look like, abort.
Also, most places I have seen need to routinely do production to stage and dev refreshes to help prevent a lot of these types of issues, which right there tells you that shit happens and it is nearly impossible to plan for all outcomes when it comes to complex systems. So tools are the right thing to help you.
Last point, when we did have major deployments and a lot of DB scripts were going out, the toolset also would run through all the scripts and warn us of conflicts. It would give us false positives sometimes, and miss some things, but it really saved our asses more than once.
Sorry for the length, just wanted to share. FWIW, we agree the simple methods work, its just that tools and frameworks, implemented properly really do mitigate some of the human and process failures we all have from time to time.