by Sep142324
267 days ago
Dynamic Tables are interesting for declarative streaming. In the ClickHouse ecosystem, you might want to look at materialized views combined with streaming engines. For real-time transformations, there are a few approaches:
- Native ClickHouse MaterializedViews with AggregatingMergeTree
- Stream processors that write to ClickHouse (Flink, Spark Streaming)
- Streaming SQL engines that can read/write ClickHouse

We've been working on streaming SQL at Proton (github.com/timeplus-io/proton), which handles similar use cases: continuous queries that maintain state and can write results back to ClickHouse. The key difference from Dynamic Tables is handling unbounded streams vs micro-batches.

What's your specific use case? Happy to discuss the tradeoffs.
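To make "maintain state" a bit more concrete, here is a rough Python sketch of the idea behind the first approach: each inserted block is reduced to partial aggregate states, and those partials are later merged into long-lived per-key states. This is only a conceptual model, not ClickHouse or Proton code; `AggState`, `aggregate_block`, and `merge_states` are hypothetical names made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AggState:
    """Partial aggregate state for one key (count and sum, so avg is derivable)."""
    count: int = 0
    total: float = 0.0

    def merge(self, other: "AggState") -> "AggState":
        # Merging two partial states is associative, so blocks can be
        # combined in any grouping and still give the same result.
        return AggState(self.count + other.count, self.total + other.total)

    @property
    def avg(self) -> float:
        return self.total / self.count if self.count else 0.0


def aggregate_block(events):
    """Reduce one inserted block of (key, value) events to partial states."""
    partial = defaultdict(AggState)
    for key, value in events:
        partial[key] = partial[key].merge(AggState(1, value))
    return dict(partial)


def merge_states(current, partial):
    """Fold a block's partial states into the long-lived per-key states."""
    merged = dict(current)
    for key, state in partial.items():
        merged[key] = merged.get(key, AggState()).merge(state)
    return merged


if __name__ == "__main__":
    states = {}
    blocks = [[("a", 1.0), ("b", 2.0)], [("a", 3.0)]]
    for block in blocks:
        states = merge_states(states, aggregate_block(block))
    for key, state in sorted(states.items()):
        print(key, state.count, state.avg)
```

Because only small partial states are kept instead of raw events, the same merge step works whether the input arrives as one event at a time or as larger batches.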
|
1. Table A: fact events, high-throughput (10k to 1M events per second), high-cardinality
2. Tables B, C, D: a couple of dimension tables (fast- or slow-changing)
The use case is straightforward: join, enrich, and look up everything into one big, flattened, analytics-friendly table in ClickHouse.
What's the best pipeline approach to achieve this efficiently and in real time?
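For reference, the enrichment described above can be sketched in a few lines of Python. This is a toy in-memory model of the join, not a recommendation for any particular engine, and every name in it (`DimensionTable`, `enrich`, the `user_id` and `device_id` columns) is a hypothetical stand-in for tables A through D.

```python
from typing import Any, Dict, Iterable, Iterator

Row = Dict[str, Any]


class DimensionTable:
    """In-memory lookup table keyed by one column; the last write wins."""

    def __init__(self, key: str, prefix: str) -> None:
        self.key = key
        self.prefix = prefix
        self.rows: Dict[Any, Row] = {}

    def upsert(self, row: Row) -> None:
        # A changing dimension is modeled by overwriting the row for its key.
        self.rows[row[self.key]] = row

    def lookup(self, key_value: Any) -> Row:
        # Return the dimension's columns with a prefix; empty if no match.
        row = self.rows.get(key_value, {})
        return {f"{self.prefix}_{k}": v for k, v in row.items() if k != self.key}


def enrich(events: Iterable[Row], dims: Dict[str, DimensionTable]) -> Iterator[Row]:
    """Flatten each fact event with the dimension rows known at arrival time.

    `dims` maps a foreign-key column on the fact event to its dimension table.
    """
    for event in events:
        flat = dict(event)
        for fk, dim in dims.items():
            flat.update(dim.lookup(event.get(fk)))
        yield flat


if __name__ == "__main__":
    users = DimensionTable("user_id", "user")
    users.upsert({"user_id": 1, "country": "DE"})
    devices = DimensionTable("device_id", "device")
    devices.upsert({"device_id": 7, "os": "linux"})

    facts = [{"event_id": 100, "user_id": 1, "device_id": 7}]
    for row in enrich(facts, {"user_id": users, "device_id": devices}):
        print(row)
```

The sketch behaves like a left join: an event with no matching dimension row passes through without the extra columns. It also enriches with whatever dimension value is current when the event arrives, so later dimension changes do not rewrite rows that were already emitted.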