Hacker News new | ask | show | jobs
by cirego 1878 days ago
> Yep. I meant it was easy to do it the inefficient way where you just refresh the entire query when any table mentioned in the query changes. You would just have to also check if something was a view and recursively parse the SQL that is used in the view. Just use Postgres `LISTEN` and triggers or the WAL for change monitoring.

Unfortunately, it turns out that recursively refreshing views still leads to surprising behavior. I think post summarizes the problem quite nicely: https://scattered-thoughts.net/writing/internal-consistency-.... If you cannot refresh all of the views, at a single point in time, then there will be internal inconsistencies in your dataset.

When looking at automatic refreshing, simple triggers and `LISTEN/NOTIFY` don't scale, as was mentioned in the comment regarding Hasura's multiplexing. I think, in the absence of incrementally maintained views, their multiplexing strategy is a good compromise for databases like postgres. However, it should be noted that continuous query / subscription of views is the exact scenario under which incremental computation will provide both lower latency and greater resource efficiency.

1 comments

> If you cannot refresh all of the views, at a single point in time, then there will be internal inconsistencies in your dataset.

In the simplest case, I'm talking about regular SQL non-materialized views which are essentially inlined.

> incremental computation will provide both lower latency and greater resource efficiency.

Wish we had some better database primitives to assemble rather than building everything on Postgres - its not ideal for a lot of things.

> In the simplest case, I'm talking about regular SQL non-materialized views which are essentially inlined.

I see that now -- makes sense!

> Wish we had some better database primitives to assemble rather than building everything on Postgres - its not ideal for a lot of things.

I'm curious to hear more about this! We agree that better primitives are required and that's why Materialize is written in Rust using using TimelyDataflow[1] and DifferentialDataflow[2] (both developed by Materialize co-founder Frank McSherry). The only relationship between Materialize and Postgres is that we are wire-compatible with Postgres and we don't share any code with Postgres nor do we have a dependence on it.

[1] https://github.com/TimelyDataflow/timely-dataflow [2] https://github.com/TimelyDataflow/differential-dataflow

> I'm curious to hear more about this

I think the [FoundationDB layer concept][1] said it well:

"When you choose a database today, you’re not choosing one piece of technology, you’re choosing three: storage technology, data model, and API/query language..."

I like the idea of [Apache Calcite][2] that provides an API to access the query planner. I think if you had more convenient access to some of the underlying components you could build a lot of cool stuff. There's too much magic where you punch in an SQL command or a config and hope it eventually does what you need.

I haven't look too much into the internals, but I'm keen to do so soon.

[1]: https://apple.github.io/foundationdb/layer-concept.html [2]: https://calcite.apache.org/