|
Thank you for your comments; I appreciate them. I'm still not sold, however. I would like to understand the underlying principles, "how this works". I don't need implementation details (happy if they are shared, though), just the main principles of operation. Please see my further comments below:

> Git is very bad at analyzing SQL diffs.

Agreed, nothing against that. So PS has a nice built-in SQL diff. Neat! But what does this really bring? I mean, it's not as if there aren't already SQL diff tools and tools to manage DDL migrations. Besides, why not layer it on top of Git? Many orgs and integration tools already have similar workflows (e.g. approval workflows, issue management tools, CI, etc.), and if it were a layer on top of the existing ones instead of a new system, it would probably have less friction to use. Just my perspective on this, of course.

> Run concurrently to your production traffic

Can you elaborate? How? Do they run on other servers? Or are they changes waiting in a queue to be applied? If they run on different servers, what do they run there, since AFAIK the migration is only DDL and there's no data?

> Will automatically throttle when your production traffic gets too high, and in particular taking care not to affect replication lag

Same as above: who will throttle, the migration? But what is the migration? Let's use my example: a column type change requires a table rewrite. So the table rewrite will throttle, i.e. slow down? But where is this table rewrite running: on the main server (apparently not) or on a shadow server (apparently not either, since migrations have no data)?

Also, you mention "when your production traffic gets too high". What is "high"? Can you quantify it? We have customers that do dozens to thousands of transactions per second. Is this high enough? Will their migrations ever run, or will they wait for very long periods of time, maybe forever?

> Will run completely lockless throughout the migration

How is this possible? Where is the migration running, then? A shadow table, a shadow server... none?

> At cut-over point

What's cut-over? Are groups of servers switched? That is what it sounds like to me, and it would explain how it could be lock-less and not affect production traffic. However, it does not explain how data is synchronized from the production database to the migration branch, nor how it keeps being updated with the real production traffic. This is essentially the crux of my failing to understand how this system works.

In general, I apologize if these are too many questions. In essence, this all sounds really good, but unless I have a deeper understanding of how the principles work, and they seem sound to me, I won't be able to recommend this for production usage, as I know from experience the many caveats migrations have. If they are all solved, hats off, but I would appreciate it if this were explained more clearly from a technical perspective. Thank you!
|
If so, this is cool. I still see some caveats:
* One already mentioned: the scope of migrations is limited to those where both the old and the new DDL are compatible with the currently running application. If this is the case, I believe it should be clearly advertised as such.
* Since the migration is asynchronous, I lose control of when to deploy changes to the application. Even a hook to trigger the deployment would go a long way.
* Not knowing exactly when the cut-over is going to happen is also potentially a problem. I understand the cut-over may involve performance degradation (e.g. higher latency) or even connection loss (could PS also confirm how it is performed?) during some period of time, possibly a short one. Still, I may need to plan a small maintenance window, and if this is async, I cannot plan the window appropriately.
None of this takes away from the merits of the solution.
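
To make the hook idea from the second caveat concrete, here is a minimal sketch of what I have in mind. Everything in it is hypothetical: `Migration`, `on_cutover`, `complete`, and `deploy_app` are names I made up for illustration, not anything PS exposes, and the cut-over is simulated by a plain method call.

```python
from typing import Callable, List


class Migration:
    """Hypothetical stand-in for an asynchronous schema migration."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.cut_over_done = False
        self._hooks: List[Callable[[str], None]] = []

    def on_cutover(self, hook: Callable[[str], None]) -> None:
        """Register a callback to run once the cut-over has completed."""
        self._hooks.append(hook)

    def complete(self) -> None:
        """Simulate the cut-over finishing, then fire the registered hooks."""
        self.cut_over_done = True
        for hook in self._hooks:
            hook(self.name)


deployed: List[str] = []


def deploy_app(migration_name: str) -> None:
    # Placeholder for rolling out the application version
    # that depends on the new schema.
    deployed.append(migration_name)


migration = Migration("widen_order_id_column")
migration.on_cutover(deploy_app)

# In a real system this would happen at some unknown later time;
# here it is triggered by hand to show the ordering.
migration.complete()
```

The point is only the ordering guarantee: the application deployment is tied to the cut-over event rather than to a guess about when the migration will finish.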