by drchaim 2 days ago
Good stuff, although I’m not quite sure about the fast OLAP use case.

If you’re already sharding by tenant for other reasons, OK… But I see CDC to a true OLAP system as more scalable.

PostgreSQL still needs real columnar tables in core. Hopefully one day.

3 comments

OLAP means different things to different people. For us, it basically just means making sure your admin dashboard keeps working:

  SELECT tenant_id, COUNT(clicks)
  FROM users
  GROUP BY tenant_id
  ORDER BY 2 DESC
  LIMIT 25;
Performance is a side effect - definitely needed and we'll do everything we can, but we are not competing with ClickHouse or Snowflake - just trying to make sharded Postgres work with your app.
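
To make the "keeps working" part concrete, here is a rough sketch of how a query like the one above could be answered across shards: run the GROUP BY on every shard, sum the partial counts, then apply ORDER BY and LIMIT on the merged result. This is only an illustration, not how any particular sharding layer implements it. In-memory SQLite databases stand in for Postgres shards, and `make_shard`, `top_tenants`, and the sample rows are made up for the example.

```python
import sqlite3
from collections import Counter

# Per-shard query: the dashboard query without ORDER BY / LIMIT, because
# ordering and limiting are only correct after partial counts are merged.
SHARD_QUERY = "SELECT tenant_id, COUNT(clicks) FROM users GROUP BY tenant_id"


def make_shard(rows):
    """Build an in-memory SQLite database standing in for one Postgres shard."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (tenant_id INTEGER, clicks INTEGER)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", rows)
    return conn


def top_tenants(shards, limit=25):
    """Scatter the query to every shard, then merge the partial counts."""
    totals = Counter()
    for conn in shards:
        for tenant_id, partial_count in conn.execute(SHARD_QUERY):
            totals[tenant_id] += partial_count
    # Equivalent of ORDER BY 2 DESC LIMIT n, applied to the merged result.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


# Hypothetical data: tenant 2 has rows on both shards, including a NULL
# click value that COUNT(clicks) skips.
shards = [
    make_shard([(1, 10), (1, 3), (1, 5), (2, 7)]),
    make_shard([(2, 1), (2, None), (3, 4)]),
]
result = top_tenants(shards)
print(result)
```

If tenant_id is also the sharding key, each tenant lives on exactly one shard, so the merge step becomes a plain concatenation and the LIMIT could in principle be pushed down to the shards as well.
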
Tomas Vondra, a major Postgres contributor, recently revived a thread on using Bloom filters - https://www.postgresql.org/message-id/flat/5cd8c20c-14b5-4b0...

So there is more core work happening on supporting OLAP, but I do think it will take some time.

In the meantime, I think we have all the pieces (storage, query engine, table format) to set up a true OLAP system. For instance, I created https://github.com/viggy28/streambed to pressure-test this idea.

Re OLAP: It's probably ~good enough~ for a lean team that's trying to keep the tech stack standard and/or doesn't have a dedicated data person to take advantage of a columnar store.