Hacker News new | ask | show | jobs
by js8 3706 days ago
This is excellent, I was thinking about similar optimization using inverses:

If one calculates some reduction over a data structure, and this reduction is group operator (so it has inverse, like addition), and this data structure is then updated, it may be easier to recalculate the reduction by updating the result through inverse than to recalculate the whole reduction again.

For example, we are calculating sum over a list, we take element out, then it's easier to just subtract the element from the sum rather than calculating the sum again.

Anyway, all in all, I think knowledge of function inverses (and perhaps also isomorphisms between types) by compiler will become very handy in the future, and will enable huge number of optimizations.

2 comments

Thank you. And, heck yes, more people need to be talking about how to deal with computations that are largely repeated. That problem comes up all over the place.

Couchdb had an interesting approach for this that did not require inverses: they persist intermediate reduce results, so that you can re-use nodes in the tree that haven't changed. So clever. http://horicky.blogspot.com/2008/10/couchdb-implementation.h...

Database indices (SQL and no) are similar in that regard - they are persisted so that a single insertion or update does not invalidate the entire index. Cache invalidation is a hard general problem, but easier when a closed system (like a database) is the only thing using its own cache, and the interface is much higher level.
I would love to see an analytics database system built on this idea. Let me define my "incoming" tables, and then define how other aggregated or transformed materialized views should be computed, and automatically update the downstream datasets based on changes in the input tables.

Let me define only "what" to compute, instead of spending all my time worrying about "how" and "when". Isn't that the point of declarative languages like SQL?