Hacker News new | ask | show | jobs
by TOMDM 823 days ago
Postgres is such a great tool.

The feature I'd love to see added that has been kicking around the mailing list for ages now would be incremental view maintenance.

Being able to keep moderately complex analysis workloads fresh in realtime would be such a boon.

6 comments

Materialized views with pg_cron to refresh them? (Or even just cron and your usual interface, if you don't want to install something extra)
This solves the same problem, just not as well, the idea of an incrementaly maintained view is that only the update needs to be computed, so a count will increment or decrement as rows are inserted or deleted.

It means complex views that could take minutes or hours to calculate from scratch can be kept fresh in realtime.

I’ve built a few of these in snowflake for time series data (with dbt, which is fine but I don’t love it). My feeling so far is that the incremental bits are kinda fiddly and often domain-specific. Is it possible to define a single “shape” that would solve >90% of these incremental refresh scenarios?
pg_ivm has a ton of restrictions though (as do other PG incremental view refresh mechanisms, like timescale hypertables).

That's why I assume it's not in vanilla postgres: adding the future with so many caveats would not make for a great experience for the full breadth of postgres users.

Is this feature even possible? How can the database unravel a complex queries with derived data to minimal updates from its normalized parts?

The easy way is to rebuild everything if any “from table” as been modified.

The manual way is to create triggers that perform the minimal updates.

There are lots of projects who have managed to achieve this.

Streaming frameworks like Kafka Streams and Flink have incrementally updating tables in memory.

Materialize is built around the concept with a Postgres compatible API.

ClickHouse materialized views act like insert triggers which update when the base table is updated.

I'm not familiar with ClickHouse materialized views, but the other tech you list (as I roughly understand them) seem more concerned with streaming SQL, which is a related but different end user experience from incrementally refreshed materialized views.
I believe differential data flow techniques (Materialize, Noria) enable this kind of stuff
MERGE was interested in 15, if that's helpful: https://www.postgresql.org/docs/current/sql-merge.html
An "incremental view" is just an index over some custom query. So you're pretty much asking for improvements in Postgres' index support.
He means a materialized view, I believe.
Did you used this in production, how do you rate it?