Hi folks. Frank from Materialize here.

The main differences you should expect to see are generality and performance.

Generality, in that there are fewer limitations on what you can express. Oracle (and most RDBMSs) build their Incremental View Maintenance on their existing execution infrastructure, and are limited by the queries whose update rules they can fit in that infrastructure. We don't have that limitation, and are able to build dataflows that update arbitrary SQL92 at the moment. Outer joins with correlated subqueries in the join constraint; fine.

Performance, in that we have the ability to specialize computation for incremental maintenance in a way that RDBMSs are less well equipped to do. For example, if you want to maintain a MIN or MAX query, it seems Oracle will do this quickly only for insert-only workloads; on retractions it re-evaluates the whole group. Materialize maintains a per-group aggregation tree, the sort of which previously led to a 10,000x throughput increase for TPC-H Query 15 [0]. Generally, we'll build and maintain a few more indexes for you (automatically), burning a bit more memory but ensuring low latencies.

As far as I know, Timescale's materialized views are for join-free aggregates. Systems like Druid were join-free and are starting to introduce limited forms of joins. ksqlDB has the same look and feel, but a. is only eventually consistent and b. round-trips everything through Kafka. Again, all AFAIK and could certainly change moment by moment.

Obviously we aren't allowed to benchmark against Oracle, but you can evaluate our stuff and let everyone know. So that's one difference.

[0]: https://github.com/TimelyDataflow/differential-dataflow/tree...
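To make the MIN/MAX point concrete, here is a minimal Python sketch of maintaining a per-group MIN under both inserts and retractions without rescanning the group. It is an illustration only, not Materialize's or Oracle's implementation: instead of the per-group aggregation tree mentioned above, it swaps in a simpler technique (a per-group min-heap with lazy deletion plus a multiplicity map). The names `GroupMin`, `update`, and `minimum` are hypothetical.

```python
import heapq


class GroupMin:
    """Sketch: maintain MIN(value) per group under inserts and retractions.

    Assumption: each change arrives as (group, value, diff), where diff is
    +1 for an inserted row and -1 for a retracted row.
    """

    def __init__(self):
        self._counts = {}  # group -> {value: multiplicity of live rows}
        self._heaps = {}   # group -> min-heap; may contain stale values

    def update(self, group, value, diff):
        counts = self._counts.setdefault(group, {})
        old = counts.get(value, 0)
        new = old + diff
        if new < 0:
            raise ValueError("retraction of a row that was never inserted")
        if new == 0:
            counts.pop(value, None)
        else:
            counts[value] = new
        # Push only when the value becomes live; retractions never touch the heap.
        if old == 0 and new > 0:
            heapq.heappush(self._heaps.setdefault(group, []), value)

    def minimum(self, group):
        heap = self._heaps.get(group, [])
        counts = self._counts.get(group, {})
        # Lazily discard heap entries whose value has been fully retracted.
        while heap and counts.get(heap[0], 0) == 0:
            heapq.heappop(heap)
        return heap[0] if heap else None


m = GroupMin()
for v in (5, 3, 7):
    m.update("a", v, +1)
print(m.minimum("a"))  # 3
m.update("a", 3, -1)   # retract the current minimum
print(m.minimum("a"))  # 5
```

Retracting the current minimum here costs a dictionary update plus a few heap pops on the next read, rather than a re-evaluation of every row in the group, which is the behavior the comment contrasts with insert-only fast paths.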
I think the Continuous Computation Language (CCL) name captures the essence of these systems: data flows through the computation/query.
These systems have always shown promise, but none has found more than niche adoption. The two most popular use cases seem to be ETL-like dataflows and OLAP-style window queries that are incrementally updated with streaming data (e.g. computations over stock tick data joined with multiple data sources).
[1] https://en.wikipedia.org/wiki/StreamSQL
[2] https://help.sap.com/doc/PRODUCTION/e1b391d2a3f3439fbab27ed8...