by erikcw 11 days ago
Thanks for releasing this! How do you handle DDL queries? Are table changes synchronized to the Iceberg table automatically?

Also, I recently started looking into olake[0] to serve the same purpose. What would you say differentiates Streambed?

[0] https://github.com/datazip-inc/olake

1 comment

Thanks for the kind words!

Short answer: yes, column-level schema changes sync to Iceberg automatically[0].

Logical replication (pgoutput in v1) doesn't actually stream DDL statements. Instead, Postgres emits a fresh Relation message describing the table's current column layout right before the next change to that table. So we diff that against the last layout we knew and infer what changed.
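
To make the diffing step concrete, here is a minimal sketch in Python. It is not Streambed's actual code: `Column` and `diff_layout` are hypothetical names, and it assumes columns are matched by name, using the per-column name, type OID, and key flag that a Relation message carries. The OIDs (23 = int4, 20 = int8, 25 = text) are the standard Postgres ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """One column entry as described by a pgoutput Relation message."""
    name: str
    type_oid: int   # Postgres type OID
    is_key: bool    # True if the column is part of the replica identity key


def diff_layout(old, new):
    """Infer column-level changes between two layouts, matching by name.

    Assumption: a rename would show up here as a drop plus an add,
    because only names are compared.
    """
    old_by_name = {c.name: c for c in old}
    new_names = {c.name for c in new}
    changes = []

    # Columns present only in the new layout were added.
    for c in new:
        if c.name not in old_by_name:
            changes.append(("add", c.name, c.type_oid))

    # Columns present only in the old layout were dropped.
    for c in old:
        if c.name not in new_names:
            changes.append(("drop", c.name))

    # Columns present in both may have changed type or key membership.
    for c in new:
        prev = old_by_name.get(c.name)
        if prev is None:
            continue
        if prev.type_oid != c.type_oid:
            changes.append(("retype", c.name, prev.type_oid, c.type_oid))
        if prev.is_key != c.is_key:
            changes.append(("key", c.name, c.is_key))

    return changes


INT4, INT8, TEXT = 23, 20, 25

old = [Column("id", INT4, True), Column("note", TEXT, False)]
new = [Column("id", INT8, True), Column("email", TEXT, False)]
changes = diff_layout(old, new)
print(changes)
```

For this input the sketch reports one added column (`email`), one dropped column (`note`), and one type change on `id` from int4 to int8.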

From there we evolve the Iceberg schema in place: flush any buffered rows under the old schema first, then write a new metadata version with the change. What's handled today:

  - ADD COLUMN — new field ID allocated; the column's Postgres DEFAULT is carried into Iceberg's initial-default/write-default, so existing rows read back correctly
  - DROP COLUMN — removed from the current schema, existing data files untouched
  - Type widening — int4→int8, float4→float8 (the changes Iceberg considers compatible)
  - REPLICA IDENTITY changes
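
The flush-then-evolve order and the first three cases can be sketched with a toy in-memory schema. This is an illustration under stated assumptions, not Streambed's implementation and not a real Iceberg library API: `Field`, `Schema`, and `evolve` are hypothetical names, and the promotion table covers only the int→long and float→double widenings mentioned above.

```python
from dataclasses import dataclass

# Primitive type promotions covered by this sketch (Iceberg names:
# int4 maps to int, int8 to long, float4 to float, float8 to double).
ALLOWED_PROMOTIONS = {("int", "long"), ("float", "double")}


@dataclass
class Field:
    field_id: int
    name: str
    type: str
    initial_default: object = None  # value read back for rows written earlier
    write_default: object = None    # value used when a writer omits the field


@dataclass
class Schema:
    fields: list
    last_column_id: int  # highest field ID ever assigned; IDs are never reused

    def add_column(self, name, type_, default=None):
        self.last_column_id += 1
        self.fields.append(
            Field(self.last_column_id, name, type_, default, default)
        )

    def drop_column(self, name):
        # Only the current schema changes; data files are not rewritten.
        self.fields = [f for f in self.fields if f.name != name]

    def widen(self, name, new_type):
        for f in self.fields:
            if f.name == name:
                if (f.type, new_type) not in ALLOWED_PROMOTIONS:
                    raise ValueError(f"cannot promote {f.type} to {new_type}")
                f.type = new_type
                return
        raise KeyError(name)


def evolve(schema, buffer, flush, change):
    """Flush rows buffered under the old schema, then apply one change."""
    if buffer:
        flush(list(buffer))
        buffer.clear()
    op, name, *rest = change
    if op == "add":
        type_, default = rest
        schema.add_column(name, type_, default)
    elif op == "drop":
        schema.drop_column(name)
    elif op == "widen":
        (new_type,) = rest
        schema.widen(name, new_type)
    else:
        raise ValueError(f"unsupported change: {op}")


schema = Schema([Field(1, "id", "int"), Field(2, "note", "string")], last_column_id=2)
buffer = [{"id": 1, "note": "a"}]
flushed = []

evolve(schema, buffer, flushed.extend, ("add", "email", "string", "n/a"))
evolve(schema, buffer, flushed.extend, ("drop", "note"))
evolve(schema, buffer, flushed.extend, ("widen", "id", "long"))
```

Tracking `last_column_id` separately from the field list means a column that is dropped and later re-added gets a fresh ID, so old data files are never misread under the new column.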

[0] https://github.com/viggy28/streambed/pull/21