by andrea_s 3532 days ago
It's a bit odd there's no mention of the PG columnar store in this article (https://www.citusdata.com/blog/2014/04/03/columnar-store-for...) - especially since it's from the same company.

It would be interesting to see how much performance improves once you use cstore_fdw (especially since 1M records is quite small for OLAP workloads).

Disclaimer: I've never used cstore_fdw, but I have evaluated a number of columnar databases in the past.

2 comments

(Ozgun from Citus Data)

We find that the primary motivation for using cstore is reducing disk I/O / storage footprint. cstore_fdw keeps a columnar layout on disk in compressed form and reads only relevant columns. For example, it's commonly used for data archival purposes.
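
To illustrate the "compressed, reads only relevant columns" point, here is a toy Python sketch. It is not cstore_fdw's actual on-disk format: `write_columnar` and `read_columns` are made-up names, and zlib plus JSON are just stand-ins for a real columnar encoding.

```python
import json
import zlib


def write_columnar(rows, columns):
    """Store each column as its own compressed blob (hypothetical layout)."""
    return {
        col: zlib.compress(json.dumps([row[col] for row in rows]).encode("utf-8"))
        for col in columns
    }


def read_columns(store, wanted):
    """Decompress only the requested columns; other blobs are never touched."""
    return {
        col: json.loads(zlib.decompress(store[col]).decode("utf-8"))
        for col in wanted
    }


# Made-up sample data: 1000 rows with three columns.
rows = [
    {"id": i, "country": "US" if i % 2 else "DE", "amount": i * 10}
    for i in range(1000)
]
store = write_columnar(rows, ["id", "country", "amount"])

# Equivalent of SELECT sum(amount): only the "amount" blob is read.
result = read_columns(store, ["amount"])
total = sum(result["amount"])

bytes_read = len(store["amount"])
bytes_total = sum(len(blob) for blob in store.values())
```

Comparing `bytes_read` with `bytes_total` shows the I/O saving for a query that touches one column out of three; a row-oriented layout would have to read every column of every row.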

That said, cstore_fdw doesn't yet make optimizations related to query planning and execution. We ran experiments in that direction (https://news.ycombinator.com/item?id=8423825), but making those changes production-ready is no small effort.

Since all benchmarks in this blog post are for in-memory data, I don't know how much they would benefit from cstore. If I have the time, I'll give it a try and update this comment with the results.

I think cstore_fdw is not popular enough among Citus users. Only a few of their customers use it, since it's not trivial to use cstore_fdw in real-time workloads. Given that its use case is mainly analytics, that seems a bit odd though.