by gianm 2284 days ago
Druid committer here. FWIW, Druid was designed to run on huge clusters, and that really shows in the multi-process architecture. The idea is that separating the components needed for ingestion, historical processing, query routing, and coordination gives you two benefits: they don't interfere with each other (spikes in ingestion load won't affect your ability to query historical data), and you can scale each one individually for your workload. You could even auto-scale some of them. For example, the original Druid cluster was operated with load-based auto-scaling for the ingestion processes.

That being said, we are currently working on reducing the number of processes to 4 (from the current 6) for a "standard" setup. The main reason is that at smaller scale there is less benefit to running a larger number of processes.

We're also working on removing some of the knobs. Actually, depending on what version you originally looked at, many of them might already be gone.
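
To make the "scale each one individually" point concrete, here is a minimal sketch of a load-based sizing rule applied to one tier at a time. This is not Druid code or configuration: the `Tier` class, the `desired_processes` function, and all the numbers are hypothetical, chosen only to illustrate the idea.

```python
import math
from dataclasses import dataclass


@dataclass
class Tier:
    """One independently scaled process type (hypothetical model, not a Druid API)."""
    name: str
    processes: int
    min_processes: int
    max_processes: int


def desired_processes(tier: Tier, load: float, target_per_process: float) -> int:
    """Return the process count that keeps per-process load at or below the target,
    clamped to the tier's configured bounds."""
    if target_per_process <= 0:
        raise ValueError("target_per_process must be positive")
    needed = math.ceil(load / target_per_process)
    return max(tier.min_processes, min(tier.max_processes, needed))


# Assumed example tiers: ingestion may auto-scale, historical is fixed-size.
ingestion = Tier("ingestion", processes=4, min_processes=2, max_processes=20)
historical = Tier("historical", processes=10, min_processes=10, max_processes=10)

# An ingestion spike (assumed unit: events/sec) resizes only the ingestion tier;
# the historical tier is never touched by this decision.
ingestion.processes = desired_processes(
    ingestion, load=90_000, target_per_process=10_000
)
```

Because each tier carries its own bounds and its own load signal, a spike on the ingestion side changes only the ingestion process count, which is the isolation property described above.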

1 comment

It's been a few years since we evaluated Druid. It's great to hear that you're simplifying things, especially for smaller setups!