by moonikakiss 406 days ago
With the push towards open table formats (Iceberg) from both Snowflake and Databricks, it's even harder to get your Postgres OLTP tables ready for OLAP.

The problem isn't the CDC / replication tools on the market.

The problem is that columnar stores (especially Iceberg) are not designed for the write/upsert patterns of OLTP systems.

They just can't keep up...

This is a big problem we're hoping to solve at Mooncake [0]: turn Iceberg into an operational columnstore, so that it can keep up (sub-second freshness) with your Postgres.

[0] https://www.mooncake.dev/

1 comments

Is Iceberg involved in every read/write? I thought it was mostly metadata?
Data files (Parquet) alone are not enough for a table with updates/deletes (the deletes are part of Iceberg "metadata"). For CDC from OLTP use cases, the pattern involves rapidly marking rows as deleted, inserting new rows, and compacting small files. This is required for minutes-latency replication.
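
To make that pattern concrete, here is a rough toy model in Python. It is not the Iceberg API or any real library: `ToyTable`, `append_file`, `mark_deleted`, `scan`, and `compact` are all made-up names, and plain lists stand in for Parquet files. It only shows the shape of the problem: data files are immutable, so an update becomes "mark the old row deleted, write the new row to a new small file", and compaction later folds everything back together.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Toy merge-on-read table (hypothetical, not the Iceberg API)."""
    # Each "data file" is an immutable list of row dicts.
    data_files: list = field(default_factory=list)
    # Delete markers: (file_index, row_position) pairs.
    deletes: set = field(default_factory=set)

    def append_file(self, rows):
        """Write a new data file and return its index."""
        self.data_files.append(list(rows))
        return len(self.data_files) - 1

    def mark_deleted(self, file_index, position):
        """Record a delete marker instead of rewriting the data file."""
        self.deletes.add((file_index, position))

    def scan(self):
        """Read all files, skipping rows that have a delete marker."""
        return [
            row
            for fi, rows in enumerate(self.data_files)
            for pos, row in enumerate(rows)
            if (fi, pos) not in self.deletes
        ]

    def compact(self):
        """Rewrite all live rows into one file and drop the delete markers."""
        live = self.scan()
        self.data_files = [live] if live else []
        self.deletes = set()


table = ToyTable()
f0 = table.append_file([{"id": 1, "v": "a"}, {"id": 2, "v": "b"}])

# An OLTP UPDATE of id=2 arrives via CDC: mark the old row, append a new small file.
table.mark_deleted(f0, 1)
table.append_file([{"id": 2, "v": "b2"}])
```

Every update adds a delete marker and a tiny file, which is why readers slow down until `compact()` runs. Note that this sketch assumes the writer already knows where the old row lives (`f0`, position 1).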

And for second-latency replication, it is more involved: you actually need to build a layer on top of Iceberg to track primary keys and apply deletions.
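
A minimal sketch of what such a layer might look like, again as a toy in plain Python. This is only an illustration, not how Mooncake or Iceberg implement it; `PkIndexedTable`, `apply_batch`, and the `(op, pk, row)` change format are assumptions made up for the example. The idea is an index from primary key to row location, so each CDC change can be turned into a delete marker without scanning the data files.

```python
class PkIndexedTable:
    """Toy layer that tracks primary keys on top of immutable data files.

    Hypothetical sketch: lists stand in for Parquet files, and a dict
    stands in for whatever index a real system would maintain.
    """

    def __init__(self):
        self.files = []      # each file is a list of row dicts
        self.deleted = set() # (file_index, position) delete markers
        self.pk_index = {}   # primary key -> (file_index, position)

    def apply_batch(self, changes):
        """Apply CDC changes given as (op, pk, row), op in {"upsert", "delete"}."""
        new_rows = []
        new_file_index = len(self.files)
        for op, pk, row in changes:
            # If the key already has a live row, mark that row as deleted.
            old_location = self.pk_index.pop(pk, None)
            if old_location is not None:
                self.deleted.add(old_location)
            if op == "upsert":
                self.pk_index[pk] = (new_file_index, len(new_rows))
                new_rows.append(row)
        # One new data file per batch, only if the batch produced rows.
        if new_rows:
            self.files.append(new_rows)

    def scan(self):
        """Return live rows, applying the delete markers at read time."""
        return [
            row
            for fi, rows in enumerate(self.files)
            for pos, row in enumerate(rows)
            if (fi, pos) not in self.deleted
        ]


t = PkIndexedTable()
t.apply_batch([("upsert", 1, {"id": 1, "v": "a"}), ("upsert", 2, {"id": 2, "v": "b"})])
t.apply_batch([("upsert", 2, {"id": 2, "v": "b2"}), ("delete", 1, None)])
```

The index lookup is what lets each change be applied in roughly constant time. Without it, every update would need a scan to find the old row, which is hard to reconcile with second-level latency.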