Hacker News new | ask | show | jobs
by malthejorgensen 295 days ago
Don’t you have to manually “refresh” Postgres materialized views, essentially making it an easier to implement cache (the Redis example in the blog post) rather than the type always-auto-updating materialized view the blog post author is actually touting?
6 comments

The real bummer is not that you have to manually refresh them, it's that refreshing them involves refreshing the entire view. If you could pick and choose what gets refreshed, you might just sometimes have a stale cache here and there while parts of it get updated. But refreshing a materialized view that is basically just not small or potentially slightly interesting runs the risk of blowing your write instance up.

For this reason I would strongly advise, in the spirit of https://wiki.postgresql.org/wiki/Don't_Do_This, that you Don't Do Materialized Views.

Sure, Differential/Timely Dataflow exist and they're very interesting; I have not gotten to build a database system with them and the systems that provide them in a usable format to end users (e.g. Materialize) are too non-boring for me to want to deploy in a production app.

This is where Oracle has a upper hand in the materialised view department: a materialised view can be refreshed either manually or automatically, with both, incremental and full, refresh options being available.
Yeah, but then you're using Oracle.
Out of the box, you're right, but there are extensions that do just that:

https://github.com/sraoss/pg_ivm

It's however not available on RDS, so I've never had the chance to try it myself.

I think it's impossible to do an incremental update in an arbitrary case. Imagine an m-view based on a query that selects top 100 largest purchases during last 30 days on an e-commerce site. Or, worse, a query that selects the largest subtree of followers on a social network site.

Only certain kinds of conditions, such as a rolling window over a timestamp field, seem amenable to efficient incremental updates. What am I missing?

That's probably beyond the scale level appropriate for a materialized view. For that I'd use something like DBT.
Yes, you need to refresh the materialized views periodically. Which mean that, just like any other caching mechanism, you're solving one problem (query performance) but introducing another (cache invalidation). I've personally used Postgres MVs to great success, but there are tradeoffs.
So the author is wrong that they’re automatic kept in sync?
No, they conflate the two concepts together, though they acknowledge this is a special case here:

> There are a few startups these days peddling a newfangled technology called “incremental view maintenance” or “differential dataflow”

I think they should be a little more explicit about the differences though, because it can be very misleading for those who arent aware of the distinction.

Oh interesting, I didn’t know that - I’ve been so far in MySQL/Vitess land for so long, I haven’t used Postgres in several years. That’s disappointing!