by viveknathani_ 11 days ago
Interesting approach. I was exploring a Postgres to ClickHouse CDC setup while helping a team some time back. This seems better, as it separates the compute (query server) and storage (S3) layers, which leaves room to be creative with cost reductions.

Aside from the cost, my major motivation is to keep the infrastructure simple. The data is already there in Postgres, so I didn't want to add another data warehouse. I have also shared my thoughts on where this is heading: https://viggy28.dev/article/postgres-gateway-drug/
It depends on the use case. For real-time, customer-facing analytics, ClickHouse’s MergeTree engine is a natural fit, so a Postgres → ClickHouse CDC setup with low latencies (single-digit seconds) is better.

Replication to Iceberg/S3 is better suited for offline analytics and data warehousing use cases. You can use the same ClickHouse engine as the query layer for Iceberg data in S3.

makes sense!