by ashug
872 days ago
I'm not familiar with Kinesis's sink APIs, but yes, I'd imagine you'll have to write your own connector from scratch. To answer your question, though, no: in the Kafka connector, the frequency of inserts into ClickHouse can be configured relatively independently of the batch size, so you don't need massive scale for real-time CH inserts. To save you a couple of hours, here's an example config for the connector:

# Snippet from connect-distributed.properties
# Max bytes per batch: 1 GB
fetch.max.bytes=1000000000
consumer.fetch.max.bytes=1000000000
max.partition.fetch.bytes=1000000000
consumer.max.partition.fetch.bytes=1000000000
# Max age per batch: 2 seconds
fetch.max.wait.ms=2000
consumer.fetch.max.wait.ms=2000
# Max records per batch: 1 million
max.poll.records=1000000
consumer.max.poll.records=1000000
# Min bytes per batch: 500 MB
fetch.min.bytes=500000000
consumer.fetch.min.bytes=500000000
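
To make the interaction between the size and age limits above concrete, here's a rough sketch in Python. It assumes the usual Kafka fetch behavior, where a fetch returns as soon as `fetch.min.bytes` have accumulated or `fetch.max.wait.ms` has elapsed, whichever comes first. The function name and the steady-throughput model are my own simplification, not part of Kafka or the connector, and real insert timing will also depend on partition count, the poll loop, and the `max.poll.records` cap.

```python
def fetch_interval_seconds(
    throughput_bytes_per_s: float,
    fetch_min_bytes: int = 500_000_000,  # fetch.min.bytes from the config above
    fetch_max_wait_ms: int = 2000,       # fetch.max.wait.ms from the config above
) -> float:
    """Estimate how long one fetch waits before returning a batch.

    Simplified model (an assumption, not Kafka code): data arrives at a
    steady rate, and the fetch returns once the minimum size is reached
    or the maximum wait expires, whichever happens first.
    """
    max_wait_s = fetch_max_wait_ms / 1000
    if throughput_bytes_per_s <= 0:
        # No data arriving: the fetch can only return on the timeout.
        return max_wait_s
    time_to_fill_s = fetch_min_bytes / throughput_bytes_per_s
    return min(time_to_fill_s, max_wait_s)
```

Under this model, a modest 1 MB/s stream still gets a batch about every 2 seconds (the age limit wins), while a 1 GB/s stream fills the 500 MB minimum in about half a second. That's why you don't need huge volume to get near-real-time inserts.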

You also might need to increase `message.max.bytes` on the broker/cluster side.

If you're still deciding, I'd recommend Kafka over Kinesis because (1) it's open source, so you have more options (e.g. self-hosting, Confluent, or AWS MSK), and (2) it has a much bigger community, meaning better support, more StackOverflow answers, a plug-and-play CH Kafka connector, etc.