|
|
|
|
|
by EdwardDiego
1903 days ago
|
|
Okay, I can see that point, but is it worth the additional latency between broker and Bookie? > I'm pretty surprised Twitter didn't see benefit from doing this if they have multiple Kafka clusters with different use cases. Yeah, I think they were too tbh. I wish I could delve more into what they experienced beyond that single blog post I linked. |
|
It might depend on what you're ingesting and how much. Being able to independently scale ingest and storage is a good alternative to have. It's not only ingest though. It's also consumption. As it stands, having a parallel consumer over a large partition spanning several GBs also requires tons of RAM because a segment must be loaded into memory. In that sense, reprocessing historical data is pretty difficult. There's a lot of complexity hidden in additional Druid, HDFS installations or shoe-horned object storage with their own indexing to support access to historical data semi-fast.