

by quodlibetor 2280 days ago

> if you give it a query that only requires certain result rows from one of its mat views, then Materialize is only going to compute the intermediate rows

This is absolutely correct!

> You can just have a bunch of “the same” Materialize node (i.e. every node just freestanding clone of a template node, with exactly the same sources and matviews) and then hit them with the parts of a map-reduce query

This should work, though internally we have been thinking about it and testing it differently. In general, you should be able to create materialized views on different "shards", each with a different `where` condition, which lets you control memory use that way. This technique does require data that is actually partitionable in this way, just as it must be partitionable for MapReduce.
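
To make the sharding idea concrete, here is a rough sketch (mine, not something from the Materialize docs) that builds one `CREATE MATERIALIZED VIEW` statement per shard, each with its own `where` condition. The source name `events`, the column `user_id`, and the helper `shard_view_sql` are all hypothetical, and the modulo predicate assumes a non-negative integer key; a range or hash predicate would work just as well if your data partitions that way.

```python
def shard_view_sql(source: str, key: str, n_shards: int) -> list[str]:
    """Build one CREATE MATERIALIZED VIEW statement per shard.

    Each view keeps only the rows whose key falls in that shard, so each
    node only has to hold its own slice of the data in memory.
    `source` and `key` are hypothetical names supplied by the caller.
    """
    if n_shards < 1:
        raise ValueError("n_shards must be at least 1")
    statements = []
    for shard in range(n_shards):
        statements.append(
            f"CREATE MATERIALIZED VIEW {source}_shard_{shard} AS "
            f"SELECT * FROM {source} WHERE {key} % {n_shards} = {shard}"
        )
    return statements


# Hypothetical example: split an `events` source three ways by `user_id`.
statements = shard_view_sql("events", "user_id", 3)
for statement in statements:
    print(statement)
```

You would then run statement N on node N, so that no single node materializes the whole source.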

> this is all irrelevant the moment you write a query that needs a pure reduce

Of course, with Materialize's sinks you can spin up a bunch of `materialized`s and connect them for a final reduce after the data has gone through e.g. Kafka or shared files. Being able to write joins and aggregates across heterogeneous sources makes this kind of workload actually pretty pleasant.
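
As an illustration of the shape of that final reduce, here is a plain-Python stand-in. It is not Materialize code: the lists of rows stand in for whatever each node would emit through a sink, and `partial_aggregate` and `final_reduce` are hypothetical names. The point is only that each node emits partials that combine by addition (count and sum), and the last stage merges them and derives the average.

```python
from collections import defaultdict


def partial_aggregate(rows):
    """What one node would compute: per-key (count, sum) over its own shard."""
    acc = defaultdict(lambda: [0, 0])
    for key, value in rows:
        acc[key][0] += 1
        acc[key][1] += value
    return {key: (count, total) for key, (count, total) in acc.items()}


def final_reduce(partials):
    """Merge per-node partials; counts and sums combine by simple addition.

    Returns key -> (count, sum, mean). The mean is derived only at the end,
    because per-node means cannot be merged without their counts.
    """
    merged = defaultdict(lambda: [0, 0])
    for partial in partials:
        for key, (count, total) in partial.items():
            merged[key][0] += count
            merged[key][1] += total
    return {key: (count, total, total / count) for key, (count, total) in merged.items()}


# Two hypothetical nodes, each holding a different shard of the data.
node_a = partial_aggregate([("x", 2), ("y", 4)])
node_b = partial_aggregate([("x", 6)])
result = final_reduce([node_a, node_b])
print(result)
```

In a real deployment the merge step would itself be a view over the sinks' output rather than a Python function; this is just the arithmetic.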