by jfim 4016 days ago

Both architectures are indeed similar. The distinction between PQL and JSON is mostly a client issue, as you could have a client that converts a hypothetical Druid query language to JSON.
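
As a rough illustration of that client-side conversion, here is a minimal Python sketch. The one-line text grammar and the `to_json_query` helper are hypothetical, invented for this example; they are neither real PQL nor any actual Druid client. The output is only shaped like a Druid-style native timeseries query, and the interval is passed in separately because the toy grammar has no time clause.

```python
import json
import re

# Hypothetical mini-grammar, for illustration only: it accepts
#   SELECT COUNT(*) FROM <table> [WHERE <dimension> = '<value>']
# and nothing else. It is not real PQL or a real Druid query language.
_QUERY_RE = re.compile(
    r"^\s*SELECT\s+COUNT\(\*\)\s+FROM\s+(?P<table>\w+)"
    r"(?:\s+WHERE\s+(?P<dim>\w+)\s*=\s*'(?P<val>[^']*)')?\s*$",
    re.IGNORECASE,
)


def to_json_query(text, interval):
    """Translate the toy text query into a Druid-style JSON query dict."""
    match = _QUERY_RE.match(text)
    if match is None:
        raise ValueError(f"unsupported query: {text!r}")

    query = {
        "queryType": "timeseries",
        "dataSource": match.group("table"),
        "granularity": "all",
        "intervals": [interval],
        "aggregations": [{"type": "count", "name": "count"}],
    }
    # The WHERE clause is optional; emit a filter only when it was present.
    if match.group("dim") is not None:
        query["filter"] = {
            "type": "selector",
            "dimension": match.group("dim"),
            "value": match.group("val"),
        }
    return query


if __name__ == "__main__":
    example = to_json_query(
        "SELECT COUNT(*) FROM events WHERE country = 'us'",
        "2015-06-01/2015-06-02",
    )
    print(json.dumps(example, indent=2))
```

The server only ever sees the JSON, which is why the choice of surface syntax stays a client concern.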

Pinot does use Helix, as it's been used successfully inside (and outside) LinkedIn to manage distributed state and coordination with a state model that's easily understandable.

To recover from losing the entire ZK cluster (which would be quite a bad operational failure, since Kafka and other services that depend on ZK would also break), you'd need to recreate your tables, re-push your data from Hadoop, and start consuming from Kafka again. We only use ZK and Helix for coordination and for storing segment-level metadata.

There are some other differences with Druid. For example, when data is pushed into Druid, it can be persisted in a separate deep storage system (S3, etc.), which is something we don't support at this point (it wasn't necessary internally at LinkedIn). We also don't have integration with R, nor documentation that's as extensive as Druid's.