|
|
|
|
|
by pcthrowaway
985 days ago
|
|
I recently had to do a migration on a timescale hypertable where a "schema" was migrated for a table which had jsonb columns containing arrays of arrays of numbers to a new table containing the same data as two-dimensional postgres arrays of numeric[][] data (better storage characteristics) Our workflow was something like: 1) Create the new hypertable 2) Create after insert trigger on first table to insert transformed data from first table into second table, and delete from first table (this ensured applications can continue running using first schema/table, without any new data being added to first table after migration) 3) Iterate over first table in time-bucketed batches using a plpgsql block to move chunks of data from first table to second table. Would pgroll enable a similar workflow? I guess I'm curious if the way pgroll works would similarly create a trigger to allow apps to work with the initial schema as a stopgap... I guess pgroll would perform the whole migration as a series of column updates on a single table, but I'm unclear on whether it attempts to migrate all data in one step (potentially locking the table for longer periods?) while also allowing applications using the old schema to continue working so there is no downtime as changes are rolled out. Has pgroll been tested with timescaledb at all? |
|
1) Create a new column with the desired type (numeric[][] in your case) 2) Backfill it from the original one, executing the up function to do the casting and any required transformation 3) Install a trigger to execute the up function for every new insert/update happening in the old schema version 4) After complete, remove the old column, as it's no longer needed in the new version of the schema
Backfills are executed in batches, you can check how that works here: https://github.com/xataio/pgroll/blob/main/pkg/migrations/ba...
I don't think any of us has tested pgroll against timescaledb but I would love to know about the results if anyone does!