by grammr
2982 days ago
Hi there! I'm one of the PipelineDB founders. This description is correct. The unique thing about PipelineDB is that it doesn't store granular data. Once all aggregates have been incrementally updated, the raw input rows are discarded and only the aggregate output is stored. This approach dramatically limits disk IO and long-term storage requirements, and enables very high performance in most cases on modest hardware. PipelineDB has been in production use for nearly four years now and is used by Fortune 100 companies.
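For readers who want a concrete picture of "update the aggregates, then discard the raw rows", here is a minimal Python sketch of the general idea. It is not PipelineDB code or its API: the names `RunningStats`, `consume`, and the sample `events` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Aggregate state kept per group (hypothetical name); no raw rows are stored."""
    count: int = 0
    total: float = 0.0

    def update(self, value: float) -> None:
        # Fold one new value into the running aggregate.
        self.count += 1
        self.total += value

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else 0.0


def consume(stream, state=None):
    """Fold each incoming (key, value) row into its group's aggregate.

    The row itself is not retained anywhere once it has been folded in,
    so storage grows with the number of groups, not the number of rows.
    """
    state = defaultdict(RunningStats) if state is None else state
    for key, value in stream:
        state[key].update(value)
    return state


# Made-up sample input: (page, latency in ms).
events = [("/home", 120.0), ("/about", 80.0), ("/home", 100.0)]
views = consume(iter(events))

for page, stats in sorted(views.items()):
    print(page, stats.count, stats.mean)
```

The trade-off is the one raised elsewhere in this thread: once rows are folded in, only questions answerable from the stored aggregates can be asked later, so anyone who needs the raw data has to keep it somewhere else.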
My hunch is that it's possible, as long as some additional computation for the future aggregate query is done on the coordinator in Citus.
PPDB looks interesting, but we also need to keep the underlying raw data, and multiple clusters would require a more complex pipeline.