by ch2026
246 days ago
1) ClickHouse's async_insert would have solved all your issues: https://clickhouse.com/docs/optimize/asynchronous-inserts
1a) If you're still getting too many files/parts, fix your PARTITION BY and your MergeTree primary key.
2) Why are you writing to Kafka when Vector (vector.dev) already does buffering/batching?
3) If you insist on Kafka, https://clickhouse.com/docs/engines/table-engines/integratio... consumes directly from Kafka (or, since you're on CHC, use ClickPipes). What's the point of Vector here?

Your current solution is unnecessarily complex. I'm guessing the core problem is that your MergeTree primary key is wrong.
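
For anyone wondering what point 1 looks like from a client, here is a rough sketch, assuming you insert over the ClickHouse HTTP interface and pass settings as URL parameters. The host, port, and the table name `logs` are made up for illustration, and `async_insert_url` is a hypothetical helper, not part of any ClickHouse client library. Only `async_insert` and `wait_for_async_insert` are real setting names.

```python
from urllib.parse import urlencode


def async_insert_url(base_url: str, table: str) -> str:
    """Build an HTTP-interface URL that asks ClickHouse to buffer the insert server-side.

    `base_url` and `table` are placeholders; adjust them to your own deployment.
    """
    params = {
        # The INSERT statement goes in the URL; the rows go in the POST body.
        "query": f"INSERT INTO {table} FORMAT JSONEachRow",
        # Let the server batch many small inserts into fewer parts.
        "async_insert": 1,
        # Acknowledge only after the buffered data has been flushed.
        "wait_for_async_insert": 1,
    }
    return f"{base_url}/?{urlencode(params)}"


url = async_insert_url("http://localhost:8123", "logs")
print(url)
```

You would then POST newline-delimited JSON rows to that URL. With `wait_for_async_insert` set to 0 the server acks sooner, but you lose the confirmation that the data was actually written.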
|
From experience, the Kafka tables in ClickHouse are not stable at high volumes, and they are harder to debug when things go sideways. It is also easier to mutate your data before ingestion using Vector's VRL scripting language than with ClickHouse table views (SQL) when dealing with complex data that needs to be denormalized into a flat table.
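
To make "denormalized into a flat table" concrete, here is a small sketch of the kind of transform meant. It is written in Python as a stand-in for a VRL remap step, not actual VRL, and the event shape (`host`, `meta`, `items`) is invented for the example.

```python
def flatten(event: dict, prefix: str = "") -> dict:
    """Collapse nested dicts into one level, joining keys with underscores."""
    flat = {}
    for key, value in event.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, f"{name}_"))
        else:
            flat[name] = value
    return flat


def denormalize(event: dict, list_key: str) -> list[dict]:
    """Emit one flat row per element of event[list_key], repeating the parent fields."""
    parent = {k: v for k, v in event.items() if k != list_key}
    rows = []
    for item in event.get(list_key, []):
        rows.append(flatten({**parent, list_key: item}))
    return rows


# Hypothetical nested event, as it might arrive before ingestion.
event = {
    "host": "web-1",
    "meta": {"region": "eu"},
    "items": [{"sku": "a", "qty": 2}, {"sku": "b", "qty": 1}],
}
rows = denormalize(event, "items")
```

Each resulting row maps directly onto the columns of a flat table, which is the part that gets awkward to express in SQL views once the nesting is irregular.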