by jmort
2794 days ago
Can you say a bit more about "performant" or point me to some information? I haven't found any yet. I'm processing millions of protobufs per second and would love to get away from batch jobs to do some incredibly basic counting. This seems like a fit conceptually... If it's a fit, any recommendations on the best way to get those protobufs off a Kafka stream and into PipelineDB would be great, too!
That being said, millions of events per second should absolutely be doable, especially if your continuous queries are relatively straightforward as you've suggested. If the output of your continuous queries fits in memory, then it's extremely likely you'd be able to achieve the throughput you need relatively easily.
Many of our users use our Kafka connector [0] to consume messages into PipelineDB, although given that you're using protobufs I'm guessing your messages require a bit more processing/unpacking to get them into a format that can be written to PipelineDB (basically something you can INSERT or COPY into a stream). In that case what most users do is write a consumer that simply transforms messages into INSERT or COPY statements. These writes can be parallelized heavily and are primarily limited by CPU capacity.
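To make the "consumer that transforms messages" idea a bit more concrete, here is a minimal Python sketch of just the transform step. It is not taken from PipelineDB or pipeline_kafka: the function names, the `decode` callback (standing in for whatever protobuf parsing you use), and the batch size are all hypothetical. It only assumes that the target stream accepts PostgreSQL's COPY text format (tab-separated columns, `\N` for NULL).

```python
import io
from itertools import islice
from typing import Any, Callable, Iterable, Iterator, List, Sequence


def escape_copy_value(value: Any) -> str:
    """Render one value in PostgreSQL COPY text format."""
    if value is None:
        return r"\N"  # COPY's default NULL marker
    text = str(value)
    # Backslash must be escaped first so later escapes are not doubled.
    return (
        text.replace("\\", "\\\\")
        .replace("\t", "\\t")
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def batches(items: Iterable[Any], size: int) -> Iterator[List[Any]]:
    """Yield lists of at most `size` items from any iterable."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def copy_payloads(
    messages: Iterable[bytes],
    decode: Callable[[bytes], Sequence[Any]],  # hypothetical: protobuf -> row tuple
    batch_size: int = 10_000,                  # assumed value, tune for your workload
) -> Iterator[str]:
    """Turn raw messages into COPY-ready text, one payload per batch."""
    for chunk in batches(messages, batch_size):
        buf = io.StringIO()
        for raw in chunk:
            row = decode(raw)
            buf.write("\t".join(escape_copy_value(v) for v in row))
            buf.write("\n")
        yield buf.getvalue()
```

Each payload could then be handed to your driver's `COPY ... FROM STDIN` facility (for example psycopg2's `cursor.copy_expert`) against the stream. Because the payloads are independent, several consumer processes can run this in parallel, one per Kafka partition for instance.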
Please feel free to reach out to me (I'm Derek) if you'd like to discuss your workload and use case further, or set up a proof-of-concept--we're always happy to help!
[0] https://github.com/pipelinedb/pipeline_kafka