Hacker News new | ask | show | jobs
by johbjo 1831 days ago
The point here is to do migrations live while clients randomly connect and stay connected for random time. When all old clients have disconnected, the old schema can be removed.

The point is not to let old clients work indefinitely.

In other versions of this idea, all views/procedures are prefixed with some revision-number but that looks messy.

1 comments

But what happens when schema Xn adds a column N, and schema Xn+1 removes it? The base schema, the source for all the views will contain all the columns Xn... from all the migrations until purged.

That purging is what has a lot of operational complexity. And renaming a column. All the rest is zero-downtime in PG already.

We have automated updates of schema. Installs are on-prem, and we support running multiple versions of our program at the same time (if there's a bug in new version, just use previous as workaround).

We rarely drop columns due to this, but haven't found it to be a huge issue. If we need to, we'll wait until the last supported version that has the column in its schema is retired. Then we'll add an explicit "drop column" to our schema file for the next version.

Views can perform arbitrary queries. They can achieve any kind of inference starting from the base data that's stored in tables. (Keep in mind that modern SQL is fully Turing-complete.) This makes any view-augmented database a viable source of rich, complex knowledge, not just mere information.
But I doubt unleashing the full expressive power of SQL is the point here. It would easily turn a moderately complicated "remove column" migration a real maintenance hell.

In the simplest form, when you add a column to some schema, it should be materialized in the base schema and exposed via the migration view.

The problems start when you add and remove 20 columns because even though they are no longer visible in migration schemas, they take up space in the base schema

@lallysingh, sorry, can't reply directly.

All I'm saying is that I don't think we need a full Turing-complete cannonball to hit the (relatively) small fly of no-downtime migrations.

Is it a silver-bullet? It's Turing-complete so high chances yes. But for me it has a high risk of causing a silver-poisoning.

Personally, I would stick with simpler solutions.

Butting in to say: Maximum reply depth is relative to the view. Just click on the comment's time and you will get a form with a reply field.
WOAH! I’ve been using HN for years and had NEVER realised that. And I honestly keep seeing people saying they can’t reply, either. This changes things!
Yes, this approach might actually decrease the power of your database as some things become much trickier. It might be worth trading of some expressive power of the database for smoother, less hands-on migrations though, which is what I'm going for.
That sounds like problems upstream of this technique. Are you complaining that it isn't a silver bullet or that it's so powerful that it'll be abused?
It would have to be in steps:

Add Xn+1: create new views without column. When no Xn clients remain: drop column from tables and drop schema Xn.

The point as I see it is to not break live client connections which expect the column to exist.

Yes, this is a very important insight! I think the author proposes this approach as replacement for regular schema migrations, whilst both approaches complement one another.

You can use views to make migrations that were previously tricky zero-downtime.

If that's not the case, then I mist've read the article wrong!

Edit: although when I think of it - if you want to eventually materialize old migration schemas into the base schema, you need to do the rename, too. Which is not zero-downtime because of new migration schemas that do the renaming automatically. Meaning changing views' definition, meaning lots of locking.

So, you still need maintenance windows to merge all the changes. Just not on every change. Otherwise the base schema will then eventually be completely out of sync and contain tons of old, unused columns.

You're right that the article doesn't mention how this can actually be practically used for migrations, but that is exactly what I had in mind. Maybe I'll write another one some day about the two-phase migrations combined with views.
But now you have the ability to merge multiple maintenance windows into one. Do for example the next 4 migrations with the view trick and the 5th is everytime applying all accumulated changes to the real tables.
Looking forward to your future blog posts! The rename thing got me really interested.
This is precisely the way I was imagining! I'm experimenting with creating a migration tool which will do this fully automatically.