I’ve been thinking about how Postgres is quietly becoming a streaming backbone, not just a transactional database. With logical replication and WAL-based CDC, Postgres can act as a real-time event source. Instead of introducing a separate log system early, many teams now stream changes directly out of Postgres into downstream systems.

The architecture is shifting from:

Traditional DB → app → message queue → stream processor

To something more like:

Postgres WAL → CDC → multiple sinks

What’s particularly interesting is using the WAL as the fan-out point. The same stream of changes can be written simultaneously into systems like Apache Kafka for event processing, stream processors like Apache Flink, and lakehouse tables such as Apache Iceberg. In that model, WAL changes effectively become a unified change stream, with Iceberg acting as a long-lived analytical sink while other systems consume the same stream for real-time workflows.

So the pattern starts looking like:

Postgres WAL
→ streaming pipeline
→ Kafka / Flink
→ Iceberg tables for analytics & historical replay

A few things I’m curious about:
• Are people treating Postgres WAL as a long-term system of event truth, or just an integration point?
• Does writing CDC streams directly into Iceberg tables change how we think about building data lakes?
• At what scale does this pattern start to break down compared to adopting a dedicated log system earlier?

Would love to hear how others are approaching Postgres + streaming + lakehouse architectures in production.
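
To make the fan-out idea concrete, here is a rough sketch in plain Python with no Postgres driver involved. Everything in it is hypothetical: `ChangeEvent`, `Sink`, `fan_out`, and the two fake sink writers are made-up names for illustration, and the LSN is simplified to a plain integer. The one real property it tries to show is that with a single change stream feeding several sinks, you can only acknowledge upstream as far as the slowest sink has durably written, which is one place the scaling question above tends to bite.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ChangeEvent:
    """One decoded row change. Field names are illustrative, not a real wire format."""
    lsn: int      # WAL position, simplified to an integer for this sketch
    table: str
    op: str       # "insert", "update", or "delete"
    row: dict


@dataclass
class Sink:
    """Hypothetical sink wrapper that remembers the highest LSN it has durably written."""
    name: str
    write: Callable[[ChangeEvent], bool]  # returns True once the event is durable
    flushed_lsn: int = 0


def fan_out(events: List[ChangeEvent], sinks: List[Sink]) -> int:
    """Deliver each event to every sink in order and return the LSN that is
    safe to acknowledge upstream: the minimum flushed LSN across all sinks."""
    stalled = set()
    for event in events:
        for sink in sinks:
            if sink.name in stalled:
                continue  # keep ordering: never skip past a failed write
            if sink.write(event):
                sink.flushed_lsn = event.lsn
            else:
                stalled.add(sink.name)
    return min(sink.flushed_lsn for sink in sinks)


# Fake sinks standing in for Kafka and Iceberg writers (assumptions, not real clients).
kafka_log = []
iceberg_rows = []


def write_kafka(event: ChangeEvent) -> bool:
    kafka_log.append((event.lsn, event.op))
    return True


def write_iceberg(event: ChangeEvent) -> bool:
    if event.lsn >= 3:
        return False  # simulate a lagging analytical sink
    iceberg_rows.append(event.row)
    return True


events = [
    ChangeEvent(1, "orders", "insert", {"id": 1, "status": "new"}),
    ChangeEvent(2, "orders", "update", {"id": 1, "status": "paid"}),
    ChangeEvent(3, "orders", "delete", {"id": 1}),
]
sinks = [Sink("kafka", write_kafka), Sink("iceberg", write_iceberg)]

safe_lsn = fan_out(events, sinks)
print(safe_lsn)  # 2: Kafka reached LSN 3, but the lagging sink only reached 2
```

In a real setup the returned position would be what the consumer reports back to the replication slot, so Postgres keeps retaining WAL until the slowest sink catches up. A dedicated log in the middle decouples that: each consumer tracks its own offset, and the database only waits on one writer.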