|
|
|
|
|
by szarnyasg
753 days ago
|
|
I used Rete in my PhD work, where I designed incremental view maintenance techniques for the openCypher graph query language (https://szarnyasg.github.io/phd/szarnyasg-phd-dissertation.p...). Rete is an elegant algorithm that is also quite flexible: one can make non-distributive aggregations, outer joins and recursive traversal work. However, it is quite memory-hungry due to the large state of certain nodes (e.g., join operators). I tried to work around this by using distributed execution, but that made the system very inefficient and complex to operate. I also looked into the TREAT algorithm but found that it is more limited in its functionality than Rete. For incremental view maintenance, the state of the art has now moved on to techniques such as differential dataflow or DBSP (https://www.vldb.org/pvldb/vol16/p1601-budiu.pdf). |
|
Incremental view maintenance is different enough from rule composition and evaluation that the model diverges to be more optimal. Collections of tuples and instead of DAGs lots of cyclical loops to continue computation of the diffs.
There are deep and intrinsic space or time trade offs so many of the modern approaches moved toward natural dataflow concurrency, and streaming semantics where space or time trade offs can be chosen at runtime through batching and data context opposed to early RETE variations which were very OOP and eagerly evaluated instead of lazy (all in memory in the same place instantiated and mutated).
It'll be interesting to see where these differential dataflow approaches go as we head into local-first constraints where central authority on data isn't possible and long divergence of synchronization occurs. Lots of CRDTs in this future for sure. E.g. https://github.com/RhizomeDB/rs-rhizome / https://fission.codes/blog/fission-reactor-dialog-first-look...