by vaughan 1289 days ago
The query planner is a black box and is free to change how it executes a query at any time, so the nodes in its plan cannot be cached. For streaming, this means we usually end up re-running the entire query or building something custom.

When a source table changes, I want everything that depends on it to update automatically and efficiently. I think pretty much every system would prefer real-time behavior like this.

If you are doing this with SQL, you will start looking into "Incremental View Maintenance", which is quite complex and still quite heavy.

Then you realize that if you take control instead of handing it off to the planner, you could code your joins and transformations by hand, cache intermediary steps as needed, and have a query that is more efficient than SQL.
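
To make the hand-coded idea concrete, here is a rough sketch of a join that caches its intermediate state and only does work for the rows that changed. The class name `IncrementalJoin`, its methods, and the sample rows are all made up for illustration; it handles inserts only and is not how any particular database does it.

```python
from collections import defaultdict


class IncrementalJoin:
    """Hypothetical equi-join that is maintained per inserted row (inserts only)."""

    def __init__(self):
        # Cached intermediate state: one hash index per input, keyed on the join key.
        self.left_index = defaultdict(list)
        self.right_index = defaultdict(list)
        # Materialized join output, kept up to date as rows arrive.
        self.result = []

    def insert_left(self, key, row):
        """Add a left-side row and return only the new output pairs it produces."""
        self.left_index[key].append(row)
        delta = [(row, r) for r in self.right_index.get(key, ())]
        self.result.extend(delta)
        return delta

    def insert_right(self, key, row):
        """Add a right-side row and return only the new output pairs it produces."""
        self.right_index[key].append(row)
        delta = [(l, row) for l in self.left_index.get(key, ())]
        self.result.extend(delta)
        return delta


join = IncrementalJoin()
join.insert_left(1, "order-a")          # no matching customer yet, nothing emitted
new_rows = join.insert_right(1, "alice")  # emits the pair for order-a
join.insert_left(1, "order-b")          # probes the cached index, emits one pair
print(new_rows)
print(join.result)
```

Each insert touches only the rows sharing its join key, instead of re-running the whole join. Deletes and updates would need extra bookkeeping, which is where it starts to look like Incremental View Maintenance again.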

But for this I would argue you need a better way to visualize your data flow and the dependency graph, because it is easy to write slow imperative code when you do it by hand.
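
As a small illustration of what an explicit dependency graph buys you, the sketch below declares which step reads from which and then asks what needs refreshing when one source changes. The node names and the `affected_by` helper are hypothetical; `graphlib.TopologicalSorter` is from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each node maps to the set of nodes it reads from.
DEPENDS_ON = {
    "orders": set(),
    "customers": set(),
    "orders_by_customer": {"orders", "customers"},
    "revenue_per_customer": {"orders_by_customer"},
    "active_customers": {"customers"},
}


def affected_by(changed, depends_on):
    """Return the downstream nodes to refresh when `changed` changes, in a safe order."""
    # static_order() yields every node after all of the nodes it depends on.
    order = list(TopologicalSorter(depends_on).static_order())
    dirty = {changed}
    for node in order:
        # A node is dirty if any of its inputs is dirty.
        if depends_on[node] & dirty:
            dirty.add(node)
    return [n for n in order if n in dirty and n != changed]


print(affected_by("orders", DEPENDS_ON))
```

With the graph written down like this, a change to `orders` only refreshes the two steps downstream of it, and `active_customers` is left alone. The same structure is what you would feed to a visualizer.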