
by saisrirampur 487 days ago

Thanks, Nathan, for chiming in and for all the support during the private beta! <3

Overall, what you shared makes sense for use cases like yours. However, there are other scenarios, such as multi-tenant SaaS analytics running large-scale workloads with PeerDB/PostgreSQL CDC. In these cases, hundreds of tables across different schemas are synced using CDC. Some customers denormalize tables using materialized views (MVs), which are a powerful feature in ClickHouse. Others power dashboards directly with JOINs, relying on the recent JOIN improvements in ClickHouse and suitably optimized order keys such as (tenant_id, id).
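
To make the order key point concrete, here is a toy Python sketch of why putting tenant_id first helps per-tenant dashboards. It is not how ClickHouse is implemented (ClickHouse uses a sparse primary index over sorted parts); the rows, the integer tenant IDs, and the helper names are all made up for illustration. The idea is only that when rows are sorted by (tenant_id, id), one tenant's data is a contiguous run that can be found with a binary search instead of a full scan.

```python
from bisect import bisect_left

# Hypothetical rows: (tenant_id, id, amount).
# Sorting tuples orders them by (tenant_id, id), mimicking the order key.
rows = sorted([
    (2, 10, 5.0),
    (1, 7, 3.5),
    (3, 1, 9.0),
    (1, 2, 1.5),
    (2, 4, 2.0),
])


def tenant_slice(sorted_rows, tenant_id):
    """Return the contiguous run of rows for one tenant via binary search.

    Assumes integer tenant IDs, so (tenant_id + 1,) is a valid upper bound.
    A 1-tuple sorts before any longer tuple that starts with the same value.
    """
    lo = bisect_left(sorted_rows, (tenant_id,))
    hi = bisect_left(sorted_rows, (tenant_id + 1,))
    return sorted_rows[lo:hi]


def tenant_total(sorted_rows, tenant_id):
    """Sum the amount column for one tenant, touching only that tenant's rows."""
    return sum(amount for _, _, amount in tenant_slice(sorted_rows, tenant_id))
```

With an (id)-only ordering, the same per-tenant query would have to look at every row, which is the cost a tenant-first order key is meant to avoid.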

When dealing with hundreds to thousands of tables and a heavily relational schema, building dual-write pipelines with denormalization becomes extremely difficult, especially when the workload involves UPDATEs.

We have many customers falling in the above bucket, replicating multiple petabytes of data to ClickHouse. A few customer deep dives on this are coming soon! :)

Side note: We are tracking support for in-transit transformations as a future feature. Until then, MVs are the way to go, which is more of an ELT approach.